Title: "Transcription Factor Target Gene Discovery"

1.0 FIELD OF THE INVENTION 

The following invention describes the utilization of solid matrix binding technology in 
combination with sequential chromosomal immunoprecipitation and molecular cloning technologies
to discover and characterize transcription factor target genes. 

2.0 BACKGROUND OF THE INVENTION 

The specific regulation of transcription within the nucleus of the cell is one of the basic facets 
of the cellular machinery and is known to be the implicit foundation behind all cellular 
characteristics. The ability to differentially regulate the activity of each of the estimated 26,000 
genes depends upon the presence or absence of various transcriptional activator and/or repressor 
proteins (Venter et al., Science, 2001, 291(5507): 1304-1351). Figure 1 illustrates this concept for 
the steroid receptor class of transcription factors. Steroid receptors, represented by the rectangles, 
dimerize and bind to target gene regulatory regions. In the presence of a steroid ligand depicted by 
the ovals, target genes are activated. In the absence of ligand the receptor is bound to corepressor
machinery and target genes are inactive. 

Interactions between these as well as other factors and their target loci have evolved over 
time into a complex series of temporal and biochemical events which governs transcription under 
tight regulatory constraints (for review see Semenza et al., Human Mutations, 1994, 3: 180-199).
It is the interaction between these factors and sequence-specific regulatory elements that has shed 
insight into the mechanisms by which cells keep such entities as cell division, differentiation and 
immunomodulation in check. By deciphering these genetic cascades and ultimately defining 
transcription factor targets it will be possible, for example, to determine just how tumor suppressing 
transcription factors such as p53 (Zambetti et al., Genes Dev., 6: 1143-1152; Zauberman et al.,
Oncogene, 1995, Jun 15; 10(12): 2361-6) and Rb (Friend et al., Proc. Natl. Acad. Sci. USA, 1987,
84: 9059-9063) exert their effects on both inhibiting cell division and promoting cell death. Indeed,
the value of transcription factor/regulatory interactions is evidenced by the wealth of patents recently
issued in the United States relating directly to the factors themselves or technology pertaining to
gene regulation (a representative set of examples includes U.S. Patent #'s 5,53,036, 5,858,973,
5,863,757, 5,880,261 and 6,117,638 herein incorporated by reference).

The identification of transcription factor target loci requires an assessment of protein/DNA 
interactions in vivo. The chromosomal immunoprecipitation (ChIP) assay has been demonstrated as 
a method which successfully allows for the purification of in vivo protein/protein interactions which 
occur in combination with DNA regulatory elements as well as direct protein/DNA interactions from 
cellular extracts of either cytoplasmic or nuclear origin (Solomon et al., Cell, 1988, 53: 937-947;
de Belle et al., Biotechniques, 2000, 29(1): 170-175). It is based upon the chemically catalyzed
cross-linkage of biochemical interactions in living cells followed by purification of desired
complexes from nonspecific contaminants. To date, use of the ChIP assay has proven to be of value
for the assessment of transcription factor complex recruitment to particular nucleotide sequences of 
known origin. By determining the presence or absence of a particular transcription factor on a 
known DNA sequence or binding site present within a particular gene, for example, it is possible to 
establish whether specific known genes are targets for regulation by a chosen factor. However, in
order to identify previously uncharacterized or undiscovered targets for potential regulation by a 
particular transcription factor a number of advances in the technology must be achieved. For 
example, efficient recovery of quantities of DNA large enough to allow for cloning and sequencing 
of the potential transcription factor targets must occur. In addition, an optimization for the 
opportunity to isolate transcribed portions of genes and eliminate noncoding genomic sequences 
which often do not reveal the identity of the target gene must be accomplished. Finally, high- 
throughput organization of sequences obtained into a searchable database format should be 
undertaken to provide for maximum utility of the discovered transcription factor target genes.
Incorporation of the modified, significantly improved ChIP assay into the described present 
invention in combination with molecular cloning methods now allows for the high-throughput 
isolation and characterization of both known and unknown transcription factor target loci. In 
addition, these modifications and improvements increase the sensitivity of target gene retrieval while 
simultaneously reducing background.

Solid phase technology has had a significant impact on the efficiency and sensitivity of 
protein complex purification. Compounds such as sepharose and magnetic beads have allowed for
the purification and characterization of protein/protein complexes from in
vivo mixtures, without compromising the quantitative or qualitative aspects of samples obtained
(Dynal Corporation Technical Handbook, 1998, Sigma Corporation, cat. #4B-200). Its application
for the purposes of identifying transcription factor target genes of unknown origin and in a
high-throughput format, however, has yet to be implemented. It is the use of solid phase technology
in the presently described invention which significantly increases the sensitivity of obtaining real in
vivo targets for transcription factors while reducing background false positive sequences obtained.

Through both a combination and modification of the above technologies as well as other 
molecular biology techniques such as exon scanning, inverse PCR and cDNA library screening the 
presently described invention allows for the extensive and exhaustive characterization of 
transcription factor target genes of both known and unknown origin and of a direct (the gene is 
bound by the factor) and indirect (interaction through other proteins) nature. It is the implementation 
of chromosomal immunoprecipitation procedures improved via the use of solid phase support and 
sequential immunoprecipitation for multiple proteins which permits the potential complete and 
thorough analysis of a great deal of the transcriptional cascades present in the nucleus of the cell.
The proposed technology described herein is applicable to a very limited quantity of cell or tissue 
samples, which makes it suitable for clinical analysis and comprehensive medical diagnostics. The 
utilization of this technology will no doubt have a significant impact on the fields of therapeutics, 
medical diagnostics and basic research related to the realm of transcriptional regulation. 

3.0 SUMMARY OF THE INVENTION

The use of chromosomal immunoprecipitation (ChIP) for the identification of targets for
transcriptional regulation by transcription factors has been limited by both the insensitivity of the 
technology to eliminate considerable nonspecific protein/DNA interactions as well as to the 
discovery and characterization of only previously identified nucleotide sequences. The presently 
described invention overcomes the above limitations of chromosomal immunoprecipitation by 
employing a combination of novel sequential immunoprecipitation procedures utilizing antibodies to 
the basal transcriptional machinery, solid phase separation procedures and extensive cloning 
applications including a modified and significantly improved version of inverse PCR which allow
for the discovery of target genes and their regulatory elements. The combination of improved 
chromosomal immunoprecipitation procedures with expression profiling and cloning technologies 
described in the present invention has allowed for the discovery and characterization of both known
and unknown target sequences for chosen transcription factors. In addition, the presently described
technology is highly automatable, allowing for extensive high-throughput analysis of virtually any
genetic cascade.

One embodiment of the present invention is the formaldehyde fixation reaction process
which cross-links DNA binding proteins with their prospective nucleotide binding sites present
within close proximity or distal to target genes in living cells and/or tissues. This fixation reaction is
designed and customized specifically for each particular cell line and/or tissue being studied. 

An additional embodiment of the present invention is other chemical methods utilized for the 
purposes of fixing and/or cross-linking proteins to their prospective target nucleotide sequences in
vivo directly through interaction with DNA or indirectly utilizing protein-protein contacts. 

Another embodiment of the present invention is the cross-linked protein/target gene complex 
created by the formaldehyde crosslinkage reaction in vivo. Said complex theoretically contains a 
mixture of protein/DNA complexes containing the desired transcription factor or regulatory protein 
directly or indirectly bound to its prospective target loci. 

Another embodiment of the present invention is an antibody which is specific for Drosophila 
melanogaster or Sciara coprophila RNA Polymerase II protein large subunit. The antibody may be
of monoclonal or polyclonal origin and may recognize similar epitopes from different species. 

Yet another embodiment of the present invention is an antibody which binds specifically to 
the mammalian transcription factor p53. Said antibody may be of monoclonal or polyclonal origin. 

Still another embodiment of the present invention is an antibody linked to magnetic beads
which binds specifically to either Drosophila melanogaster or Sciara coprophila RNA polymerase
II protein large subunit. It is the solid-phase support linkage which enhances recovery and
specificity of target chromatin upon immunoprecipitation. 

Another embodiment of the present invention is an antibody which is linked to magnetic 
beads which binds specifically to the mammalian p53 protein. 

Yet another embodiment of the present invention is the recovered fraction of the cross-
linked, fixed chromatin protein/DNA complex.

Another embodiment of the present invention is the sonicated chemically cross-linked 
protein/DNA complex isolated after sonication but prior to immunoprecipitation. Sonication allows
for efficient immunoprecipitation of DNA fragment sizes small enough to be characterized in a
high-throughput format via polymerase chain reaction (PCR) or other molecular biology techniques. 



Still another embodiment of the present invention is the immunoprecipitated protein/DNA 
complex prior to release of the antibody and reversal of cross-linkage isolated utilizing antibodies 
which recognize either the Drosophila melanogaster or Sciara coprophila RNA polymerase II large
subunit as well as the mammalian p53 protein. 



An additional embodiment of the present invention is the sequential immunoprecipitation of
cross-linked protein/DNA complexes from living cells and tissues utilizing antibodies to core
transcriptional machinery factors first and to specific transcription factors second. Sequential
immunoprecipitation eliminates the majority of nontranscribed sequences and satellite DNA by
focusing only upon transcribed and/or actively regulated genes. It is primary immunoprecipitation
with antibodies to proteins found in the basal transcriptional apparatus which results in increased
sensitivity through a reduction in the amount of nontranscribed genomic DNA pulled down during
subsequent immunoprecipitation reactions. Theoretically only actively transcribed genetic
sequences are present as templates for the second round of immunoprecipitation. Secondary
immunoprecipitation with antibodies to specific transcription factors is thereby significantly more
efficient as is described herein and allows for the opportunity to characterize transcription factor
function with respect to the regulation of gene activity. These sequential rounds of
immunoprecipitation may be performed in any order with the similar result of increased sensitivity
for the discovery of transcription factor target genes and decreased background or nonspecific
sequences obtained.
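
By way of illustration only, the selection logic of sequential immunoprecipitation can be sketched in Python as two successive filters over a pool of fragments. The `Fragment` class, the fragment names and the function names are hypothetical, and the model is idealized: it assumes each antibody pulls down every fragment carrying its antigen and nothing else, which real immunoprecipitations only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A sheared, cross-linked chromatin fragment (hypothetical toy model)."""
    name: str
    bound_proteins: frozenset


def immunoprecipitate(fragments, antigen):
    """Keep only the fragments carrying the protein the antibody recognizes."""
    return [f for f in fragments if antigen in f.bound_proteins]


def sequential_ip(fragments, antigens):
    """Apply one round of immunoprecipitation per antigen, in the order given."""
    selected = list(fragments)
    for antigen in antigens:
        selected = immunoprecipitate(selected, antigen)
    return selected


# Illustrative pool: only one fragment is both transcribed and factor-bound.
pool = [
    Fragment("active_target", frozenset({"RNA Pol II", "p53"})),
    Fragment("active_nontarget", frozenset({"RNA Pol II"})),
    Fragment("silent_site", frozenset({"p53"})),
    Fragment("satellite_DNA", frozenset()),
]

pol2_then_p53 = sequential_ip(pool, ["RNA Pol II", "p53"])
p53_then_pol2 = sequential_ip(pool, ["p53", "RNA Pol II"])
print([f.name for f in pol2_then_p53])
```

In this idealized model either order of rounds returns the same fragments, which mirrors the statement that the rounds may be performed in any order.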

Additionally, solid phase chromosomal immunoprecipitation eliminates loss of cross-linked
protein/DNA complex material initially precipitated from cellular extracts by providing a solid
support and thereby enhances the potential ability to recover target DNA fragments and hence the
nucleotide sequences corresponding to these fragments. Excessive loss is prevented through clean,
efficient recovery of antibody/protein/DNA complexes due to tight linkages between the solid phase
(beads in the case of the present invention) and antibodies.

Yet another embodiment of the present invention is the utilization of polymerase chain 
reaction (PCR) to detect known target loci within the collection of pull-down fragments which 
putatively contains both known and unknown target genes. It is the detection and monitoring of 
known controls which allows for a characterization of the efficiency of the system. 

As well, an additional embodiment of the present invention is the utilization of inverse PCR 
(I-PCR) in combination with solid phase sequential chromosomal immunoprecipitation for purposes 
of defining only direct targets for regulation by specific transcription factors as well as for 
background reduction. Specifically, oligonucleotides corresponding to transcription factor binding 
sites are used to amplify by PCR the flanking sequences present in DNA fragment populations isolated by the
technology described herein. The application of this modified version of I-PCR to sequentially
immunoprecipitated chromosomal templates hence results in the discovery and cloning of direct
targets for regulation by the transcription factor in question. 
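
The principle of amplifying flanking sequences outward from a binding site can be sketched as follows. This is a generic textbook model of inverse PCR, not the modified procedure of the invention, whose details are not given in this passage. It assumes the fragment has been circularized by self-ligation and that each primer is derived from one half of the binding site; all function names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def outward_primers(site):
    """Design two primers that point away from each other out of the site."""
    half = len(site) // 2
    forward = site[half:]                      # extends into the right flank
    reverse = reverse_complement(site[:half])  # extends into the left flank
    return forward, reverse


def inverse_pcr_product(fragment, site):
    """Predict the top-strand product amplified from the circularized fragment.

    Returns None when the fragment lacks the binding site, so fragments that
    were recovered only through indirect interactions give no product.
    """
    start = fragment.find(site)
    if start == -1:
        return None
    half = len(site) // 2
    left_flank = fragment[:start]
    right_flank = fragment[start + len(site):]
    # Around the circle: right flank first, then the ligation junction,
    # then the left flank, bounded by the two primer-derived site halves.
    return site[half:] + right_flank + left_flank + site[:half]


site = "AGACATGCCTAGACATGCCT"            # illustrative 20-mer only
fragment = "TTGACC" + site + "GGATCA"    # illustrative flanks only
forward_primer, reverse_primer = outward_primers(site)
product = inverse_pcr_product(fragment, site)
print(forward_primer, reverse_primer, product)
```

Because no product forms on fragments lacking the site, the same step that recovers flanking sequence also reduces background.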

Another embodiment of the present invention is the facilitated cloning of both known and 
unknown target genes from DNA fragments isolated by the presently described methods. These 
potential targets, for transcription factors of DNA binding and nonDNA binding origin, are cloned 
through successive rounds of screening against cDNA libraries and genomic DNA libraries, ligation 
and transfer into bacteriophage and/or plasmid vectors, polymerase chain reaction including but not 
limited to I-PCR and DNA sequencing. 

Yet another embodiment contemplated by the present invention is the screening of 
immunoprecipitated DNA fragments potentially containing target loci against libraries, arrays and/or
microarrays of both known and unknown genes. These libraries and arrays may be of either cDNA
or oligonucleotide composition. It is the screening of immunoprecipitated DNA fragments against
these libraries, arrays and microarrays which facilitates the discovery of target genes for the
transcription factor being studied. Said screen allows for a rapid identification of coding sequences
for transcription factor target loci present in the collection of DNA.

An additional embodiment of the present invention is the cloning of DNA fragment
collections containing transcription factor target genes into bacteriophage arms and subsequent 
packaging into particles for the purposes of rapid conventional screening and sequencing. These 
bacteriophage libraries may be screened with known DNA probes or other unknown probes for 
purposes of discovery of target loci. 

Yet another embodiment of the present invention is the cloning of DNA fragment collections 
containing transcription factor target genes into exon scanning vectors which may be introduced into 
eukaryotic cells for purposes of rapidly identifying potential coding sequences within the collection 
of DNA fragments. 

Another embodiment of the present invention includes the nucleotide sequences and 
corresponding amino acid sequences and protein products as determined to be targets for either 
direct or indirect transcriptional regulation. 

An additional embodiment of the present invention is the organization of the nucleotide and 
corresponding amino acid sequences discovered into a database or databases for purposes of rapid 
search and characterization of these sequences for functional and possible therapeutic relevance. 
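
As a minimal sketch of such a database, the following Python example stores discovered sequences in SQLite and retrieves them by transcription factor. The table layout, column names and function names are assumptions made for illustration; the text does not specify a schema, and the stored sequence is a placeholder rather than a reported target.

```python
import sqlite3


def create_target_db(path=":memory:"):
    """Create a small searchable store for discovered target sequences."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE targets (
            id             INTEGER PRIMARY KEY,
            factor         TEXT NOT NULL,
            nucleotide_seq TEXT NOT NULL,
            amino_acid_seq TEXT,
            regulation     TEXT CHECK (regulation IN ('direct', 'indirect'))
        )
        """
    )
    # Index the factor column so per-factor searches stay fast as data grows.
    conn.execute("CREATE INDEX idx_targets_factor ON targets (factor)")
    return conn


def add_target(conn, factor, nucleotide_seq, amino_acid_seq=None, regulation=None):
    """Insert one discovered sequence and return its row id."""
    cur = conn.execute(
        "INSERT INTO targets (factor, nucleotide_seq, amino_acid_seq, regulation) "
        "VALUES (?, ?, ?, ?)",
        (factor, nucleotide_seq, amino_acid_seq, regulation),
    )
    conn.commit()
    return cur.lastrowid


def find_by_factor(conn, factor):
    """Return (nucleotide_seq, regulation) rows recorded for one factor."""
    return conn.execute(
        "SELECT nucleotide_seq, regulation FROM targets WHERE factor = ? ORDER BY id",
        (factor,),
    ).fetchall()


conn = create_target_db()
add_target(conn, "p53", "AGACATGCCTAGACATGCCT", regulation="direct")  # placeholder
rows = find_by_factor(conn, "p53")
print(rows)
```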

4.0 DESCRIPTION OF THE FIGURES 

Figure 1 Is a diagrammatic representation of transcriptional regulation by a steroid receptor 
transcription factor (see text for details). 

Figure 2 Is an illustration of the chemistry behind in vivo formaldehyde crosslinkage of nuclear 
protein/DNA interactions (see text for details). 

Figure 3 Is a diagrammatic illustration of the use of antibody-coated magnetic beads for the 
recovery of protein/DNA fragments (see text for details). 

Figure 4 Is a demonstration of the generation of "customizable" fragment sizes by adjustment of 
sonication conditions (see text for details). 

Figure 5 Is an outline of the technology described in the present invention for purposes of
discovering transcription factor target genes (see text for details). 

Figure 6 Is a diagrammatic illustration of Exon Scanning. 

Figure 7A-D Is a demonstration of the utility of the described technology and invention through the 
analysis of RNA Polymerase II presence on the Sciara coprophila gene II/9-1 under different
conditions (see text for details). 

Figure 8 Is a further demonstration of the utility of the described technology and invention and 
demonstrates p53 target gene identification after RNA Polymerase II large subunit "preIP/IP", p53 IP
and stringent washing conditions (see text for details). 

Figure 9 Is a diagrammatic illustration of inverse PCR (I-PCR) applied towards DNA fragments 
isolated by methods described herein (see text for details). 

Table 1 Is a listing of two target nucleotide sequences representing regulatory elements identified 
for the transcription factor p53 and the relative induction of transcription from these sequences 
linked to a minimal promoter in the presence of p53 (see text for details).

5.0 DETAILED DESCRIPTION OF THE INVENTION

The presently described invention details a methodology for the rapid high-throughput 
identification of transcription factor target genes. It is achieved through the implementation of solid 
phase sequential chromosomal immunoprecipitation utilizing antibodies to both tissue and cell-type 
restricted transcription factors and those of the basal core transcriptional machinery. It is the 
application of this sequential immunoprecipitation which allows for efficient extraction of 
protein/DNA, RNA/DNA and RNA/DNA/protein complexes from living cells and/or tissues.
Combined with the presently described standard as well as modified molecular cloning 
methodologies these techniques result in rapid and thorough identification and characterization of 
transcription factor target loci. Particularly, implementation of solid phase sequential chromosomal 
immunoprecipitation in combination with modified inverse polymerase chain reaction, exon 
scanning and cloning strategies allows for the identification of direct transcription factor target loci.
Implementation of solid phase sequential chromosomal immunoprecipitation in combination with 
cDNA library and microarray hybridization technologies also allows for rapid identification of 
transcription factor target genes. 

The utility of the presently described invention lies in the rapid identification of
transcription factor target genes of both a direct (i.e. binds the factor) and indirect (factor is recruited 
to the gene through other proteins) nature from a living cell line or tissue. Application of the 
presently described invention allows for the vast identification of target loci for virtually any 
transcription factor of either a DNA binding or nonDNA binding nature. It is accomplished through 
a standard fixation of chromatin in living material, such as cells in tissue culture or isolated tissues, 
followed by successive immunoprecipitations of extracted protein/DNA complexes with antibodies 
specific to both transcription factors of interest as well as antibodies specific to the proteins of the 
core transcriptional machinery. Typically, DNA isolated by these methodologies may then be 
subjected to various molecular biology procedures such as I-PCR, cloning into exon-trapping vectors
and/or screening against cDNA libraries or microarrays of known genes to determine the content of
actively transcribed genes pulled down with antibodies against chosen transcription factors. 

Antibodies contemplated by the present invention are utilized for the purposes of 
immunoprecipitating either DNA binding or nonDNA binding proteins and may be of monoclonal or 
polyclonal origin. These antibodies described herein are designed against full length proteins as well 
as against particular epitope amino acid subsets present within those proteins. The antibodies are of 
rabbit and goat origin, but may be produced through the immunization of any of a number of 
organisms typically used for research antibody production. 

The solid phase technology contemplated by the present invention involves the use of 
magnetic beads. These beads are conjugated to antibodies which specifically recognize particular 
proteins recovered from living cells and tissues. The magnetic aspect of the bead allows for efficient 
separation of the bead/antibody/protein/DNA complex from nonspecific materials, including wash 
solutions, present in the reaction mixture. Other solid phase technologies contemplated by the 
present invention include sepharose or other solid matrices linked to protein A, protein G or directly
conjugated to antibodies which recognize specifically chosen proteins present within living 
cells/tissues. 

For the purposes of the present invention the act of immunoprecipitating a protein/DNA
complex will involve the utilization of an antibody of either polyclonal or monoclonal origin to
directly and specifically recognize, bind and extract a protein/DNA complex from a bulk population
of cross-linked protein/DNA complexes. It is this immunoprecipitation process which allows for the
efficient isolation and ultimate characterization of transcription factor target genes.

Molecular biology procedures described in the present invention include use of the collection 
of DNA fragments potentially containing transcription factor target genes recovered after 
immunoprecipitation to screen cDNA and/or genomic libraries. Additional molecular biology 
procedures include cloning the collection of DNA fragments potentially containing transcription 
factor target sequences into bacteriophage arms or plasmids for efficient screening and/or
sequencing. 

For purposes of the present invention the term "gene" will refer to any and all regions of the 
genome of all organisms which code for proteins. This definition will also include all control 
elements directly or indirectly associated with controlling the production of mRNA from the gene. 

In addition, for the purposes of the present invention the term "control element" will refer to 
any regulatory element which dictates, controls or modulates the production of mRNA from the 
corresponding gene. The production of mRNA is presumed to occur, at least in part, through the 
binding of transcription factors. 

For the purposes of the present invention the term "transcription factor" will refer to any 
protein which binds directly or indirectly to a control element present within a gene and dictates, 
controls or modulates either the production or inhibition of production of mRNA from that particular 
gene. 

As well, for the purposes of the present invention the term "transcriptional activator" will 
refer to any protein which binds either directly to a DNA control element or indirectly to a DNA 
control element through other proteins and activates or drives the production of mRNA from the
gene corresponding to that particular control element. 

For the purposes of the present invention the term "transcriptional repressor" will pertain to
any protein which actively downregulates and thereby represses the production of mRNA from a 
gene to levels below those naturally occurring in an in vivo setting or to undetectable levels. 

Also for the purposes of the present invention, the term "transcriptional modulator" will refer 
to any protein which dictates, controls or modulates the production of mRNA from a gene. 

A gene will be delineated as active and therefore "expressed" when a nucleotide sequence 
referred to as an activating element is present within the gene or in close proximity to the gene and 
drives the production of detectable levels of mRNA, presumably through the actions of a 
transcriptional activating factor or transcriptional modulator. A gene will be delineated as not 
expressed when mRNA cannot be detected, presumably due to the absence of control activating 
elements, due to the absence of transcriptional activators present on those elements or due to the 
presence of transcriptional repressors. 

Finally, for the purposes of the present invention the term "active repression" will refer to the 
direct downregulation of a gene due to the presence of a silencing element within that gene or in 
close proximity to the gene, presumably through the binding at that particular silencing element or 
negative regulatory element of a transcriptional repressor. 

5.1 Transcriptional Regulation and Human Physiology

With the recent enormous influx of genomic information into the scientific community 
inevitably comes questions about genetic hierarchies and ultimately gene function. How gene 
activity is regulated and in what context is as crucial to an understanding of our genetic makeup as 
the sequence itself. More importantly, the question "What genes are expressed or repressed with 
respect to physiology?" represents an important concern regarding the discovery and 
characterization of drug targets. It is clear that the regulation of transcription plays a critical role in a 
limitless array of physiological processes. For example, a number of transcription factors have
previously been implicated as either protooncogenes or tumor suppressors, thus affecting cancer
progression by promotion or inhibition of cellular growth and apoptosis (for review see Levine et al.,
Nature, 1991, 351: 453-456). The transcription factor p53 has been shown to play an indispensable
role in the suppression of tumorigenesis and thus has come to be known as a tumor suppressor in
its wild-type form (Seto et al., Proc. Natl. Acad. Sci. USA, 1992, 89: 12028-12032). The statistical
predisposition to tumorigenesis correlating with mutations in p53 is staggering, with, for example,
approximately 75-80% of all colon carcinomas studied exhibiting a loss of both p53 alleles. Such a
high incidence of cancer upon inactivation of p53 DNA binding function strongly suggests that
downstream targets for p53 transcriptional control may potentially play a role in tumor suppression
and represent potential avenues of therapeutic intervention. Indeed, several of the targets for direct
regulation by p53 have been demonstrated to be involved in the arrest or downregulation of cell
proliferation and/or cell death. Examples of these include mdm2 (Oliner et al., Nature, 1993, 362:
857-860), p21/WAF1 (El-Deiry et al., Cell, 1993, 75: 817-825), hsp70 (Maehara et al., Oncology,
2000, 58: 144-151), cyclin E (Smith et al., Exp. Cell Res., 1997, 230: 61-68) and MDR1 (Achanzar
et al., Toxicol. Appl. Pharmacol., 2000, 64: 291-330). As p53 binding sites have been mapped in
each of these loci, excellent internal controls exist for monitoring the sensitivity and background
issues critical to the success of the technology described herein.
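
By way of illustration only, the use of these known loci as internal controls can be expressed as a simple recovery calculation. The function name, the representation of a pull-down as a list of locus names and the example list of detected loci are all hypothetical; the control set is the five loci named above.

```python
KNOWN_P53_CONTROLS = frozenset({"mdm2", "p21/WAF1", "hsp70", "cyclin E", "MDR1"})


def control_recovery(detected_loci, controls=KNOWN_P53_CONTROLS):
    """Report which known control loci were recovered in a pull-down.

    Returns (fraction_recovered, recovered, missed), with both lists sorted.
    """
    if not controls:
        raise ValueError("at least one control locus is required")
    detected = set(detected_loci)
    recovered = sorted(controls & detected)
    missed = sorted(controls - detected)
    return len(recovered) / len(controls), recovered, missed


# Hypothetical result: three of the five controls were detected by PCR.
detected = ["mdm2", "p21/WAF1", "MDR1", "unknown_clone_7"]
fraction, recovered, missed = control_recovery(detected)
print(f"{fraction:.0%} of controls recovered; missed: {missed}")
```

A low recovered fraction would point to a sensitivity problem, whereas a large excess of unrecognized sequences relative to the controls would point to background.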



The p53 DNA recognition site consists of a dimer of two ten-mers which exists very rarely
within the mammalian genome, occurring only around 300 times in a genome of three billion
nucleotides (El-Deiry et al., Nat. Genet., 1992, 1(1): 45-49). This rare occurrence of the regulatory
site for p53 provides a valuable assessment of the efficiency of the described
technology. Sequence information acquired from immunoprecipitated fragments can be scanned for
the presence of this site and direct targets immediately identified while background is
simultaneously assessed.
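
A scan of this kind can be sketched in Python as shown below. The sketch assumes the consensus commonly attributed to the cited El-Deiry et al. report, namely two copies of 5'-RRRCWWGYYY-3' (R = purine, W = A or T, Y = pyrimidine) separated by 0 to 13 nucleotides; that consensus is not spelled out in the present text, and the function names and the example clone are hypothetical.

```python
import re

# Assumed half-site consensus RRRCWWGYYY; two copies with a 0-13 nt spacer.
HALF_SITE = "[AG]{3}C[AT]{2}G[CT]{3}"
P53_SITE = re.compile(f"{HALF_SITE}[ACGT]{{0,13}}{HALF_SITE}")


def find_p53_sites(sequence):
    """Return (start, matched_sequence) for each non-overlapping site found."""
    return [(m.start(), m.group()) for m in P53_SITE.finditer(sequence.upper())]


def expected_spacing(genome_size=3_000_000_000, site_count=300):
    """Average nucleotides per site, using the figures quoted in the text."""
    return genome_size // site_count


# Illustrative clone: four flanking bases, a perfect site, four flanking bases.
clone = "TTTT" + "AGACATGCCT" + "AGACATGCCT" + "GGGG"
print(find_p53_sites(clone))
print(expected_spacing())
```

The assumed consensus reads the same on both strands, so scanning one strand is sufficient. With roughly one site per ten million nucleotides, a site found in a short immunoprecipitated fragment is unlikely to be present by chance.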

In addition to p53, other factors have also been implicated in the progression of cancer. The 
female sex steroid hormone, estrogen, is required for the development and progression of human 
breast cancer. To understand how endocrine therapy works and why tumors may become resistant to 
one therapy but not another, an understanding of the molecular mechanisms of estrogen receptor 
(ER) function and identification of molecular targets for ER are required. The ER is a nuclear 
protein that functions as a transcription factor to regulate expression of estrogen responsive genes 
(Tenbaum et al., Int. J. Biochem. Cell Biol., 1997, 29: 1325-1341). Some of these estrogen-
regulated genes mediate growth and development of the mammary glands, and it is apparent that
many are important for the effects of estrogen on tumor cell proliferation. After estrogen or an 




WO 02/14550 PCT/US01/24823 
estrogen analog binds to ER, dimerization of the receptor is induced which then allows binding of
the complex to estrogen-responsive elements (ERE), a region in the promoter of estrogen target
genes. The binding of the ER dimer to this promoter region then facilitates transcription of that gene.
Most endocrine therapies for breast cancer inhibit tumor formation by depriving the cell of estrogen
or by blocking its receptor. Synthetic drugs like tamoxifen were first called antiestrogens because
they bind ER and competitively block the effects of estrogen on tumor cell proliferation and on
expression of certain genes. However, it is not surprising that administration of this drug can have a
spectrum of effects, depending on species, tissue, cell or gene context (Katzenellenbogen et al.,
Breast Cancer Res. Treat., 1997, 44: 23-38). In some cases, these "antiestrogens" can be estrogenic,
stimulating transcription of genes which may change cellular morphology. For instance, tamoxifen,
which works as an antagonist to ER in breast cancer cells, can induce tumor development in the
uterus (Deligdisch, L., Mod. Pathol., 1993, 6(1): 94-106). In other cases, sometimes in the same
cell, they have predominant antiestrogenic activity. These data provide the rationale for the
identification of patterns for ER gene targets which can be activated and/or repressed upon a
variety of drug treatments in different organs or tissues. Those molecular targets could potentially
be used as important tumor markers and/or could provide additional indispensable information on
hormonal responsiveness and further therapeutic intervention.

Determination of cell fate and regulation of terminal differentiation by transcription factors
represent major roles for these regulatory proteins in regulating physiology. The ability to generate
mature lymphocytic cells in tissue culture, for example, has been of intense interest for a number of
years as the potential for replenishing low T and B cell counts in patients who are undergoing
chemotherapy or are HIV positive becomes a reality. The ikaros family of transcription factors has
been shown to promote the differentiation of hematopoietic stem cells into the mature B and T cell
lineages (Nichogiannopoulou et al., Semin. Immunol., 1998, 10: 119-125). Correspondingly, mice
which possess a mutation in the conserved DNA binding domain of the ikaros locus fail to possess B
and T lymphocytes as well as the earliest progenitors of these lineages (Winandy et al., Cell, 1995,
83: 289-299). Thus, the ability to determine the downstream targets for ikaros allows for the
potential to identify genes which promote hematopoietic stem cell differentiation and hence B and T
cell production. The DNA recognition sequence for the ikaros family has been previously
characterized (Molnar et al., Mol. Cell Biol., 1999, 14: 8292-8303); thus loci identified through the
technology described herein as potential targets can be scanned for this recognition sequence as a
confirmation of interaction.
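
The scanning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of the invention: the motif `GGGAA` is used here only as an assumed placeholder for an ikaros-type recognition sequence, and the clone names and sequences are hypothetical. The function reports exact matches on both strands of each candidate locus.

```python
# Minimal sketch: scan candidate loci for a recognition sequence on both strands.
# ASSUMPTIONS: the motif and the candidate sequences below are illustrative
# placeholders, not data from the specification.

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_motif(seq, motif):
    """Return (position, strand) for every exact match of motif in seq.

    Positions are 0-based on the forward strand; '-' marks a match to the
    reverse complement of the motif.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits

HYPOTHETICAL_MOTIF = "GGGAA"  # assumed placeholder recognition sequence
candidate_loci = {
    "clone_A": "ATCGTGGGAATCCGA",  # forward-strand match
    "clone_B": "ATCGATCGATCGATC",  # no match
    "clone_C": "CCATTCCCAGGT",     # reverse-strand match
}

for name, sequence in candidate_loci.items():
    print(name, find_motif(sequence, HYPOTHETICAL_MOTIF))
```

A locus with no match is not necessarily a false positive, since the factor may contact DNA indirectly; the scan serves only as supporting confirmation.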

Other more organogenic effects on human physiology are also controlled by transcription
factors. Cardiac hypertrophy, or enlargement of the heart, is the result of attempts by the
cardiovascular system to compensate for progression of many forms of cardiac disease, including
hypertension, mechanical load, heart attack (myocardial infarction) and others (for review see
McKinsey et al., Curr. Opin. Genet. Dev., 1999, 9: 267-274). At the molecular level, external stress
factors such as hypertension and myocardial infarction result in a reactivation of the fetal cardiac
genetic program, as well as a general physiological enlargement of the myocardium through
increased myocardial cell size. A number of transcription factors have been suggested to be
involved in the initiation and maintenance of the reactivation of fetal cardiac genes. GATA4, a
member of the GATA family of transcription factors, is involved in the upregulation of several fetal
cardiac genes (Herzig et al., Proc. Natl. Acad. Sci. USA, 1997, 94: 7543-7548). Studies of GATA4 and
other factors involved in response to cardiac stress will reveal novel cascades of genes representing
potential targets for therapeutic prevention and/or intervention of enlargement of the heart.

In addition, there are a number of human genetic disorders affecting both growth and 
reproductive capacity. Several of these include mutations in transcription factors which have 
previously been shown to play vital roles in neuroendocrine organogenesis during embryonic 
development as well as appropriate functioning of this system in the adult (for review see Treier et 
al., Genes Dev ., 1998, 12: 1691-1704). Defects in human growth and fertility represent a major 
concern among the world population. Many of the problems relating to these phenomena arise due 
to misregulation of genes which play crucial roles in the neuroendocrine system. Progress in 
understanding this complex field has been aided by the fact that several murine animal models have 
been shown to exhibit phenotypes strikingly similar to that demonstrated by allelic mutations in 
humans. For example, mutations in the human Prop-1 locus result in familial combined pituitary 
hormone deficiency, a finding quite similar to that found in the naturally occurring Ames mouse 
mutant (Wu et al., Nat. Genet., 1998, 6: 1143-1152). As well, both the Snell and Jackson dwarf 
mice have been shown to contain mutations within the Pit-1 locus (Rhodes et al., Curr. Opin. Genet. 
Dev., 1994, 4: 709-717). A number of human dwarfism cases which display similar pituitary 
lineage loss have now been demonstrated to carry mutations in the same locus (Wu et al., Nat.
Genet., 1998, 6: 1143-1152).

Both the Prop-1 and Pit-1 genes encode POU domain-containing homeobox transcription factors
which act at distinct temporal and spatial points within the development of the pituitary gland.
Studies on the Ames dwarf have suggested that Prop-1 acts upstream of Pit-1 in the developmental
regulatory cascade, putatively setting up a rudimentary organ from which Pit-1 is able to guide
lineage determination and differentiation (Dasen et al., Cell, 1999, 97: 587-598). Indeed, Pit-1 has
been shown by a number of groups to play an indispensable role in the survival and terminal
differentiation of the somatotrope, lactotrope and thyrotrope pituitary cell lineages (Rhodes et al.,
Curr. Opin. Genet. Dev., 1994, 4: 709-717). In the absence of these cell populations, specifically
that of the somatotrophic lineage, dwarfism and other growth defects occur (Treier et al., Genes
Dev., 1998, 12: 1691-1704). Many of the mutations which have been characterized in humans as
well as other organisms for these factors lie in the DNA binding domain, which strongly suggests
that an inability to effectively bind to and thus regulate downstream target genes is directly involved
in the growth and fertility defects observed. An application of the technology described herein to
identify and characterize both direct and indirect targets for these factors will undoubtedly reveal
novel pathways for potential therapeutic intervention for both growth and fertility defects in humans.

It is evident that the activation or repression of gene activity is essential for the appropriate 
development, growth and viability of an organism. An understanding of the transcription factors 
described above as well as many others that govern these processes and specifically the 
identification of which target genes are controlled by these factors in both temporal and spatial 
manners during embryonic development and throughout adulthood is crucial to understanding 
various phenotypic characteristics. Current technologies such as subtractive hybridization (Lockyer
et al., Parasitology, 2000, 120: 399-407), differential display (Neilson et al., Genomics, 2000, 3:
13-24) and SAGE (Stephan et al., Mol. Genet. Metab., 2000, 70: 10-18), while effective at identifying
target genes, are generally time consuming and do not necessarily arrive at direct transcription factor
targets. In addition, they require cell lines or tissues to differentially express the factor being
studied, a task not often easily achieved. Other methods such as protein/DNA affinity purification
may deduce enhancer binding sites from the genome but lack the ability to reveal the gene regulated
by the enhancer, thus requiring positional cloning of exonic coding sequences (Solomon et al., Cell,
1988, 53: 937-947). Recent progress has been made, however, in the identification of transcription
factor targets and their corresponding coding sequences in vivo through the infection of living cells
with modified retroviruses which seek out genomic transcription factor binding sites via
integrase/transcription factor fusion proteins incorporated into the viral particle and "trap" exons
(Burgess et al., U.S. Patent #6,139,833, Issued October 31, 2000).



By studying the intricate outline of terminal target genes regulated by various transcription 
factors rather than the factors themselves, it will be possible not only to discover novel therapeutic 
targets, but also to efficiently focus drug delivery to discrete physiologic gene products thus 
enhancing effectiveness and reducing or eliminating side effects. Moreover, the genes discovered as 
regulated by transcription factors may be used in a microarray format as phenotypical markers for 
medical diagnostics. 

5.2 Modified Chromosomal Immunoprecipitation

In order to identify and characterize direct molecular targets for regulation by specific
transcription factors, it is necessary to employ technologies which take advantage of the extremely
specific protein/DNA contacts involved in gene regulation which are maintained within intact cells
or tissues. The Chromosomal Immunoprecipitation (ChIP) assay has been well established and may
be successfully performed by those skilled in the art (Solomon et al., Cell, 1988, 53: 937-947; de
Belle et al., Biotechniques, 2000, 29(1): 170-175). It allows for manipulation of the above
mentioned inherent physical interactions between proteins and DNA to delineate known downstream
targets for virtually any transcription factor. This method is based on the ability of formaldehyde or
other chemicals to produce DNA/protein, RNA/protein and protein/protein cross-links at 2 angstrom
resolution in vivo within intact cells or tissues. Addition of formaldehyde to living cells results in
formation of an extensively cross-linked network of biopolymers, thus preventing any large-scale
redistribution of cellular components. Formaldehyde does not react with free double-stranded DNA,
avoiding kinetic constraints due to DNA damage. In addition, formaldehyde crosslinks can be
reversed under mild conditions so that DNA, RNA and protein complexes can be further analyzed
separately. Figure 2 illustrates the chemistry behind the crosslinkage method. The experimental
design originates from the pioneering work of Alexander Varshavsky, who developed the chromatin
fixation, purification and immunoprecipitation scheme for analyzing the distribution of histones in
the Drosophila heat-shock gene promoter (Solomon et al., Cell, 1988, 53: 937-947). Upon reversal
of crosslinkage and mechanical shearing of cellular DNA, protein/DNA interactions can be assessed
by utilizing sequence information of known target loci in combination with the Polymerase Chain
Reaction (PCR) (Innis et al., Academic Press, 1990; McPherson et al., IRL Press, 1991; Erlich, A.,
Stockton Press, 1989). Recent work has demonstrated that the ChIP assay can be applied to the
study of virtually any transcription factor which comes into contact, either directly or indirectly, with
DNA (Scully et al., Science, 2000, 290(5494): 1127-31; Jepsen et al., Cell, 2000, 102(6): 753-63).

Living cells and/or isolated tissues are fixed with formaldehyde by adding the cross-linking
agent directly to the cell growth medium or tissue. Although the presently described invention
utilizes salivary glands from Sciara coprophila for RNA Polymerase II and HeLa cells for p53, it is
in no way limited to these particular tissues and cell types. Other tissues from other organisms and
species include, but are not limited to heart, brain, spleen, lung, liver, muscle, kidney, testis, ovary, 
gut, hypothalamus, pituitary, tooth bud, mesoderm, ectoderm, endoderm, neural tube, somite, 
smooth muscle, cardiac muscle, skeletal muscle and all embryonic tissues from all possible 
timepoints. Cell lines from which transcription factor target genes may be discovered via 
methodologies provided by the presently described invention include, but are in no way limited to 
13C4 (mouse/mouse, hybrid, hybridoma), 143 B (human, bone, osteosarcoma), 2 BD4 E4 K99 
(mouse/mouse, hybrid, hybridoma), 3 C9-D11-H11 (mouse/mouse, hybrid, hybridoma), 3 E 1 
(mouse/mouse, hybrid, hybridoma), 34-5-8 S (mouse/mouse, hybrid, hybridoma), 3T3 (mouse, 
Swiss albino, embryo), 3T3 L1 (mouse, Swiss albino, embryo), 3T6 (mouse, Swiss albino, embryo),
5 C 9 (mouse/mouse, hybrid, hybridoma), 5G3 (hybrid, hybridoma), 6-23 (clone 6) (rat, thyroid, 
medullary, carcinoma), 7 D4 (mouse/rat, hybrid, hybridoma), 72 A 1 (mouse/mouse, hybrid, 
hybridoma), 74-11-10 (mouse/mouse, hybrid, hybridoma), 74-12-4 (mouse/mouse, hybrid, 
hybridoma), 74-22-15 (mouse/mouse, hybrid, hybridoma), 74-9-3 (mouse/mouse, hybrid, B cells x 
myeloma, hybridoma, B cell), 76-7-4 (mouse/mouse, hybrid, hybridoma), 7C2C5C12 
(mouse/mouse, hybrid B cells x myeloma, hybridoma), 9 BG 5 (mouse/mouse, hybrid, hybridoma), 
9-4-3 (mouse/mouse, hybrid, hybridoma), A 172 (human, glioblastoma), A 375 (human, malignant 
melanoma), A 72 (dog, golden retriever, connective, not defined tumor), A-427 (human, Caucasian, 
lung, carcinoma), A-498 (human, kidney, carcinoma), A-704 (human, kidney, adenocarcinoma), 
A549 (human, lung, carcinoma), ACHN (human, Caucasian, kidney, adenocarcinoma), ACT 1 
(mouse/mouse, hybrid, hybridoma), AE-1 (mouse/mouse, hybrid, hybridoma), AE-2 (mouse/mouse, 
hybrid, hybridoma), Aedes albopictus (mosquito - Aedes albopictus, larvae), AGS (human, 
Caucasian, stomach, adenocarcinoma), AK-D (cat, lung, embryonic), Amdur II (human, Caucasian,
skin, fibroblast, methylmalonicacidemia), AV 3 (human, amnion), B 95-8 (monkey, marmoset,
leukocyte), B-63 (mouse, mammary gland, carcinoma), B2-1 (mouse, BALB/c, embryo), B50 (rat,
nervous system, nervous tissue glial tumor), B69 (mouse/mouse, hybrid, hybridoma), B95a
(monkey, marmoset), BAE (bovine, aorta), BALB 3T12-3 (mouse, BALB/c, embryo), BALB 3T3
clone A31 (mouse, BALB/c, embryo), BB (fish - Ictalurus nebulosus (bullhead brown catfish),
trunk), BBM.1 clone E9 (mouse/mouse, hybrid, hybridoma), BC3H1 (mouse, brain, brain tumor),
BCE C/D-lb (bovine, cornea), BeWo (human, placenta, choriocarcinoma), BF-2 (fish - bluegill fry, 
caudal trunk), BGM (monkey, African green, kidney), BHK 21 clone 13 (hamster, golden Syrian, 
kidney), BNL CL.2 (mouse, BALB/c, liver, embryonic), BNL SV A.8 (mouse, liver, embryonic), 
BS/BEK (bovine, kidney, embryonic), BSC-1 (monkey, African green, kidney), BT (bovine, 
turbinate), Bu (IMR-31) (buffalo, lung), BUD-8 (human, Caucasian, skin, fibroblast), BXPC-3 
(human, pancreas, adenocarcinoma), C 127 I (mouse, RIII, mammary gland, mammary tumor),
C2C12 (mouse, muscle), C32 (human, melanoma, amelanotic), C6 (rat, glial tumor), Caco-2 
(human, Caucasian, colon, adenocarcinoma), Caki-1 (human, Caucasian, kidney, carcinoma), Caki-2 
(human, Caucasian, kidney, carcinoma), CaLu-1 (human, Caucasian, lung, carcinoma, epidermoid), 
Calu-3 (human, Caucasian, lung, adenocarcinoma), CAPAN 1 (human, Caucasian, pancreas,
adenocarcinoma), CAPAN 2 (human, Caucasian, pancreas, carcinoma), CAR (fish - goldfish, fin), 
CCF-STTG1 (human, Caucasian, astrocytoma, anaplastic, grade IV), CCRF S 180 II (mouse, CFW, 
sarcoma), CCRF-CEM (human, Caucasian, peripheral blood, leukemia, acute lymphoblastic), 
CCRF-SB (human, Caucasian, peripheral blood, leukemia, acute lymphoblastic), CEM/C2 (human, 
leukemia, T cell), Cf2Th (dog, thymus), Chang liver (human, liver), CHO K1 (hamster, Chinese,
ovary), CHP 3 (human, Black, skin, fibroblast, galactosemia), CHP 4 (human, Black, skin,
fibroblast, asymptomatic galactosemia), CHSE 214 (fish - salmon, embryo), Clone 1-5c-4 WKD of
Chang Conjunctiva (human, conjunctiva), Clone M-3 (mouse, (CxDBA) F1, skin, melanoma), CMT
93 (mouse, C57BL/ICRFat, rectum, carcinoma), COS-1 (monkey, African green, kidney), COS-7
(monkey, African green, kidney), CPA (bovine, endothelium, pulmonary artery), CPA 47 (bovine,
endothelium, pulmonary artery), CPAE (bovine, endothelium, pulmonary artery), CRFK (cat, 
domestic, kidney), CRI-D11 (rat, NEDH, insulinoma), CSE 119 (fish - salmon, embryo), CV 1 
(monkey, African green, kidney), CVC 7 (Agrothis segetum, hybrid, hybridoma), D 17 (dog, bone, 
sarcoma, osteogenic), Daudi (human, Black, lymphoma, Burkitt), DB 9 G.8 (mouse/mouse, hybrid, 
hybridoma), DB1-Tes (dolphin, Delphinus bairdi, testis), DeDe (hamster, Chinese, lung), Detroit
510 (human, Caucasian, skin, fibroblast, galactosemia), Detroit 525 (human, Caucasian, skin,
fibroblast, Turner syndrome), Detroit 529 (human, Caucasian, skin, fibroblast, trisomy 21 / Down
syndrome), Detroit 532 (human, Caucasian, foreskin, trisomy 21 / Down syndrome), Detroit 539
(human, Caucasian, skin, fibroblast, trisomy 21 / Down syndrome), Detroit 548 (human, Caucasian,
skin, fibroblast, partial D trisomy), Detroit 550 (human, skin, fibroblast), Detroit 551 (human,
Caucasian, skin, embryonic), Detroit 562 (human, Caucasian, pharynx, carcinoma), Detroit 573
(human, Caucasian, skin, fibroblast, B/D translocation), Detroit 6 (human, bone marrow), DK (dog,
beagle, kidney), DON (hamster, Chinese, lung), DU 145 (human, Caucasian, prostate, carcinoma), 
Duck embryo (duck, Pekin, embryo), EDerm (horse, dermis), EBTr (bovine, trachea, embryonic), 
ECTC (bovine, thyroid, embryonic), ECV304 (human, Asiatic, umbilical cord), EIAV 12E8.1 
(mouse/mouse, hybrid, hybridoma), Ep 16 (mouse/mouse, hybrid, hybridoma), EPC (fish, carp 
epidermal, epithelioma), EREp (rabbit, skin, embryonic), ESK-4 (pig, kidney, embryonic), FBHE 
(bovine, heart, embryonic), Fc 2 Lu (cat, lung, embryonic), Fc 3 Tg (cat, tongue, embryonic), FeLV 
3281 (cat, lymphoma), FHM (fish - minnow, skin), FL (human, amnion), FRhK-4 (monkey, rhesus, 
kidney, embryonic), G-7 (mouse, Swiss-Webster, muscle), G.8 (mouse, Swiss- Webster, muscle), 
GCT (human, lung, metastasis, histiocytoma), GH 1 (rat, Wistar-Furth, pituitary tumor), GH 3 (rat, 
Wistar-Furth, pituitary tumor), Girardi heart (human, heart), GK 1.5 (mouse/rat, hybrid, hybridoma), 
H 16-L10-4R 5 (mouse/mouse, hybrid, hybridoma), H 9 (human, leukemia, acute lymphoblastic), H- 
4-II-E (rat, liver, hepatoma), H4 (human, Caucasian, brain, nervous tissue glial tumor), H4-II-E-C3 
(rat, AxC, liver, hepatoma), H4TG (rat, liver, hepatoma), H9c2(2-1) (rat, BDDC, heart), Hak 
(hamster, Syrian, kidney), HCT 116 (human, colon, carcinoma), HCT-8 (human, intestine, ileocecal, 
adenocarcinoma), HEL 299 (human, Caucasian, lung, embryonic), HeLa (human, Black, cervix, 
carcinoma, epitheloid), HeLa 229 (human, Black, cervix, carcinoma, epitheloid), HeLa S 3 (human, 
Black, cervix, carcinoma, epitheloid), Hep 2 (human, Caucasian, larynx, carcinoma, epidermoid), 
Hep 3B2.1-7 (human, liver, carcinoma, hepatocellular), Hep G2 (human, Caucasian, liver, 
carcinoma, hepatocellular), Hepa 1-6 (mouse, liver, hepatoma), HFL (human, lung), HG 261 
(human, Caucasian, skin, fibroblast, Fanconi anemia), HGF 24 (human, gingival stroma), HL 60 
(human, Caucasian, peripheral blood, leukemia), HOS (human, Caucasian, bone, osteosarcoma), 
HRT 18 (human, rectum-anus, adenocarcinoma), Hs 683 (human, neuroglia, glioma), Hs 863 .T 
(human, bone, sarcoma, Ewing's), HS 883.T (human, bone, giant cell, sarcoma), HS 888 Lu (human, 
Caucasian, lung), Hs-27 (human, foreskin), HSDM1C1 (mouse, Swiss albino, fibrosarcoma), HT 
1080 (human, Caucasian, acetabulum, fibrosarcoma), HT 1376 (human, Caucasian, bladder, 
carcinoma), HT-29 (human, Caucasian, colon, adenocarcinoma), HuTu 80 (human, 
adenocarcinoma), I 10 (mouse, BALB/cJ, testis, Leydig cells, testicular tumor), IB-RS-2 (pig,
kidney), IBRS-2 D10 (pig, kidney), IEC-6 (rat, intestine, small), IM-9 (human, Caucasian, bone
marrow, multiple myeloma), IMR 31 Bu (buffalo, lung), IMR 32 (human, Caucasian,
neuroblastoma), IMR-90 (human, Caucasian, lung, embryonic), Intestine 407 (human, Caucasian,
intestine, embryonic), J 111 (human, leukemia, monocytic), J 774A.1 (mouse, BALB/c, monocyte-
macrophage, not defined tumor), Jensen sarcoma (rat, sarcoma), JH 4 clone 1 (guinea pig, strain 13,
lung), Jiyoye (human, Black, ascitic fluid, lymphoma, Burkitt), JM (human, leukemia, T cell), Jurkat
J6 (human, leukemia, T cell), K 562 (human, Caucasian, pleural effusion, leukemia, chronic
myeloid), KATO III (human, Mongoloid, stomach, carcinoma), KB (human, Caucasian, mouth,
carcinoma, squamous cell), KHOS/NP (human, Caucasian, bone, osteosarcoma), KMP (mouse), L 
1210 (mouse, ascitic fluid, leukemia, lymphocytic), L 132 (human, lung, embryonic), L 21.6 (mouse, 
hybrid, hybridoma), L 243 (mouse/mouse, hybrid, hybridoma), L 5.1 (mouse/mouse, hybrid, 
hybridoma), L 929 (mouse, C3H/An, connective), L6 (rat, skeletal muscle), LC 540 (rat, Fisher, 
testis, Leydig cells, testicular tumor), LLC-MK2 (monkey, rhesus, kidney), LLC-PK1 (pig, kidney), 
LLC-RK1 (rabbit, New Zealand white, kidney), LLC-WRC 256 (rat, Walker, carcinoma), LM from 
NCTC clone 929 (mouse, C3H/An, connective), LM TK negative (mouse, C3H/An, connective), 
LNCaP.FGC (human, Caucasian, prostate, carcinoma), LS 180 (human, Caucasian, colon,
adenocarcinoma), M 1 (mouse, SL, bone marrow, leukemia, myeloid), M-2E6 (mouse/mouse, 
hybrid, hybridoma), M2-1C6-4R3 (mouse/mouse, hybrid, hybridoma), MA 104 (monkey, African 
green, kidney, embryonic), mAB 35 (mouse/rat, hybrid, B cells x myeloma, hybridoma, B cell), 
MARC 145 (monkey, kidney), Mc Coy (mouse), MC/CAR (human, plasmacytoma, B cell), MCF 7 
(human, Caucasian, breast, adenocarcinoma), MDBK (bovine, kidney), MDBK(BU 100) (bovine, 
kidney), MDCC MSB1 (chicken, avian, spleen, lymphoma), MDCK (dog, cocker spaniel, kidney), 
MDOK (sheep, kidney), MDTC RP 19 (turkey, lymphocyte, Marek's disease), MEL m (monkey, 
rhesus, mammary gland, mammary tumor), MG-63 (human, bone, osteosarcoma), MH 1 C 1 (rat, 
buffalo, liver, hepatoma), MH-S (mouse, lung), MIA PaCa-2 (human, Caucasian, pancreas, 
carcinoma), MiCl1 (mustela vison (mink), lung), MK-D6 (mouse/mouse, hybrid, hybridoma), MLA
144 (gibbon, lymphosarcoma), MOLT-3 (human, peripheral blood, leukemia, acute lymphoblastic T 
cell), MOLT-4 (human, peripheral blood, leukemia), MPC-11 (mouse, BALB/c, myeloma), MPK 
(minipig, kidney), MRC 5 (human, lung, embryonic), MRSS-1 (mouse/mouse, hybrid, hybridoma, B 
cell), MS (monkey), Mv 1 Lu (mustela vison (mink), lung), MVPK-1 (pig, kidney), NA C 1300 
clone (mouse, brain, neuroblastoma), Namalwa (human, Black, lymphoma, Burkitt), NCTC 2544 
(human, skin, keratinocyte), NCTC clone 3526 (monkey, rhesus, kidney), Neuro-2a (mouse, albino, 
neuroblastoma), NIH:OVCAR-3 (human, Caucasian, adenocarcinoma, ovary), NOR 10 (mouse, 
muscle), NRK 49F (rat, kidney), NSO (mouse, BALB/c, myeloma), OA1 (sheep, brain), OHH1.K 
(deer, kidney), OKT 3 (mouse/mouse, hybrid, hybridoma), OKT 4 (mouse/mouse, hybrid,
hybridoma), OKT 8 (mouse/mouse, hybrid, hybridoma), P3HR1 (human, lymphoma, Burkitt), P388
D1 (mouse, DBA/2, monocyte-macrophage, lymphoma), P3 NS1 Ag4 (mouse, myeloma),
P3NP/PFN (mouse/mouse, hybrid, hybridoma), P815 (mouse, mastocytoma), PANC-1 (human,
Caucasian, pancreas, carcinoma), PC 61-5-3 (mouse/rat, hybrid, hybridoma), PC-12 (rat, adrenal 
medulla, pheochromocytoma), PD 5 (pig, kidney), PEG 1-6 (mouse/mouse, hybrid, B cells x 
myeloma, hybridoma, B cell), PK 15 (pig, kidney), PLC/PRF/5 (human, liver, hepatoma, Alexander 
cells), Pt K1 (marsupial - potoroo, kidney), QT 35 (quail, Japanese, fibrosarcoma), QT 6 (quail,
Japanese, fibrosarcoma), R 2 C (rat, Wistar-Furth, testis, Leydig cells, testicular tumor), R 9 ab 
(rabbit, New Zealand white, lung), R D (human, Caucasian, muscle, rhabdomyosarcoma, 
embryonal), R63 (mouse/mouse, hybrid, B cells x myeloma, hybridoma, B cell), RAB-9 (rabbit, 
New Zealand white, skin, fibroblast), Raji (human, Black, lymphoma, Burkitt), RBL 1 (rat, 
leukemia, basophilic), RFL 6 (rat, Sprague-Dawley, lung), RK 13 (rabbit, kidney), RK 13/1 (rabbit, 
kidney), RPMI 1788 (human, Caucasian, peripheral blood), RPMI 1846 (hamster, golden Syrian, 
skin, melanoma, melanotic), RPMI 2650 (human, nasal septum, carcinoma, squamous cell), RPMI 
8226 (human, peripheral blood, myeloma), RR 1022 (rat, Amsterdam, sarcoma), RTG 2 (fish - trout, 
rainbow, gonad), RTO (fish - trout, rainbow, ovary), Saos-2 (human, Caucasian, bone, 
osteosarcoma), Sf 1 Ep (rabbit, domestic, epidermis), SIRC (rabbit, cornea), SK-LU-1 (human, 
Caucasian, lung, adenocarcinoma, grade III), SK-MES-1 (human, lung, carcinoma, squamous cell), 
SK-NEP-1 (human, Caucasian, kidney, Wilms' tumor), SK-OV-3 (human, Caucasian, ovary, 
adenocarcinoma), SSE 5 (fish - trout, embryo), STO (mouse, SIM, embryo), SV-T2 (mouse, 
BALB/c, embryo), SW 13 (human, Caucasian, adrenal cortex, adenocarcinoma), T 98 G (human, 
Caucasian, glioblastoma), Tb 1 Lu (bat, lung), TE 671 (human, Caucasian, medulloblastoma), TK 
TS 13 (hamster, Syrian, kidney), U 937 (human, Caucasian, pleural effusion, lymphoma, 
histiocytic), VERO (monkey, African green, kidney), VERO 76 (monkey, African green, kidney), 
VERO C 1008 (monkey, African green, kidney), WC 1 (fish, dermis, sarcoma), WF 2 (fish - Walleye
whole fry, fibroblast), WI 26 VA 4 (human, Caucasian, lung, embryonic), WI 38 (human, Caucasian, 
lung, embryonic), WI 38 VA 13 (human, Caucasian, lung, embryonic), WI-1003 (human, lung), 
WISH (human, amnion), WM 115 (human, skin, melanoma), XC (rat, Wistar, sarcoma), Y 1 
(mouse, LAF1, adrenal cortex, adrenal tumor), ZR-75-1 (human, Caucasian, breast, carcinoma) and 
any other as yet undiscovered or uncharacterized cell lines through which the presently described 
invention may be implemented for the discovery of transcription factor target genes. 

Preliminary time course experiments spanning between 5 minutes and 1 hour of fixation are
performed to yield the best combination of in vivo fixed chromatin, high DNA recovery, and small
size of chromatin fragments. For specific purposes, the cross-linking time can be considerably
reduced or prolonged and must be optimized for the particular tissue or cell line and the transcription
factors being studied. Figure 2 illustrates the chemical cross-linking of DNA and proteins by
formaldehyde. Formaldehyde (HCHO) is a very reactive dipolar compound in which the carbon
atom is the electrophilic center. Amino and imino groups of the proteins (e.g., the side chains of
lysine and arginine) and of nucleic acids (e.g., cytosines) react with formaldehyde, leading to the
formation of a Schiff base (reaction I). This intermediate can react with a second amino group
(reaction II) and condenses. Cross-links may be reversed by heating in Tris-HCl-containing buffers.
This leads to a drop in pH and protonation of amino groups, thus forcing the equilibrium in the
reverse direction. In Figure 2, (A) illustrates formaldehyde-mediated cross-linking between the side
chains of the lysines and (B) depicts cross-linking between cytosine and lysine.

While the present invention employs formaldehyde as a chemical component for the cross-
linking of protein/DNA complexes in living cells and tissues, it is in no way limited to this reagent
for fixation. Other chemicals may also be utilized to fix proteins to DNA (Benashski et al.,
Methods, 2000, 22: 365-371). Some of these include, but are in no way limited to, the homobifunctional
compounds difluoro-2,4-dinitrobenzene (DFDNB), dimethyl pimelimidate (DMP), disuccinimidyl
suberate (DSS), the carbodiimide reagent EDC, psoralens including 4,5',8-trimethylpsoralen, photo-
activatable azides such as 125I(S-[2-(4-azidosalicylamido)ethylthio]-2-thiopyridine), otherwise known
as AET, (N-[4-(p-azidosalicylamido)butyl]-3'[2'-pyridyldithio]propionamide), also known as APDP,
the chemical cross-linking reagent Ni(II)-NH2-Gly-Gly-His-COOH, also known as Ni-GGH,
sulfosuccinimidyl 2-[(4-azidosalicyl)amino]ethyl]-1,3-dithiopropionate), also known as SASD, (N-
14-(2-hydroxybenzoyl)-N-11-(4-azidobenzoyl)-9-oxo-8,11,14-triaza-4,5-dithiatetradecanoate) and
any as yet uncharacterized or undiscovered reagents which result in the cross-linking of
protein/DNA complexes in living cells and tissues.

Upon fixation of protein/DNA complexes in intact cells or tissues, cellular lysis is
accomplished through standard protocols which may be successfully implemented by those skilled in
the art (Solomon et al., Cell, 1988, 53: 937-947; de Belle et al., Biotechniques, 2000, 29(1): 170-
175). For the purposes of chromosomal immunoprecipitation it is important that metal chelators
such as EDTA and EGTA as well as protease inhibitors be added to the reaction to prevent
degradation of protein/DNA complexes. The mixture is subsequently lysed by
its passage through a 26G needle. Typically 4 rounds of 25-30 needle passages per round are
necessary for sufficient lysis and chromatin fractionation. It is speculated that these parameters must
be defined for each tissue or cell type. Alternatively, samples may be lysed via the use of a Dounce
homogenizer or the implementation of any mechanical stress which results in efficient breakage of
cellular membranes and hence release of chromatin containing protein/DNA complexes.

Fixed, lysed cells or tissues are subsequently subjected to high resolution mechanical
shearing, i.e., sonication, for the purposes of producing manageable DNA fragment sizes of the
desired length. In general, the size of the DNA fragments may be critical for high-resolution
mapping studies as well as the identification of transcription initiation sites and/or exonic sequences;
thus sonication provides a convenient method to "customize" fragment length as illustrated in Figure
4. The sizes observed are a typical result obtained with HeLa cells cross-linked for 30 minutes and
sonicated by routine use of the Branson model 250 sonifier with microtip at constant power for
various amounts of time. While the presently described invention employs the use of a Branson
model 250 sonifier/sonicator for the purposes of generating appropriate DNA fragment length from
fixed lysed cells and/or tissues, it is hypothesized that any mechanical instrument or enzymatic
digestion capable of shearing or cutting soluble chromatin into lengths small enough to be
manipulated via standard or modified molecular biology procedures for the purposes of discovering
transcription factor targets may be utilized. These include, but are in no way limited to, other
sonicator models as well as restriction enzyme digestion by frequent as well as rare-cutting enzymes
including, but in no way limited to, Acc I, Aci I, Acl I, Afe I, Afl II, Afl III, Age I, Ahd I, Alu I,
Alw I, AlwN I, Apa I, ApaL I, Apo I, Asc I, Ase I, Ava I, Ava II, Avr II, Bae I, BamH I, Ban I, Ban
II, Bbs I, Bbv I, BbvC I, BceA I, Bcg I, BciV I, Bcl I, Bfa I, BfrB I, Bgl I, Bgl II, Blp I, Bmr I, Bpm
I, BsaA I, BsaB I, BsaH I, Bsa I, BsaJ I, BsaW I, BsaX I, BseR I, Bsg I, BsiE I, BsiHKA I, BsiW I,
Bsl I, BsmA I, BsmB I, BsmF I, Bsm I, BsoB I, Bsp1286 I, BspD I, BspE I, BspH I, BspM I, BsrB I,
BsrD I, BsrF I, BsrG I, Bsr I, BssH II, BssK I, BssS I, BstAP I, BstB I, BstE II, BstF5 I, BstN I,
BstU I, BstX I, BstY I, BstZ17 I, Bsu36 I, Btg I, Btr I, Bts I, Cac8 I, Cla I, Dde I, Dpn I, Dpn II, Dra
I, Dra III, Drd I, Eae I, Eag I, Ear I, Eci I, EcoN I, EcoO109 I, EcoR I, EcoR V, Fau I, Fnu4H I, Fok
I, Fse I, Fsp I, Hae II, Hae III, Hga I, Hha I, Hinc II, Hind III, Hinf I, HinP1 I, Hpa I, Hpa II, Hpy188
I, Hpy188 III, Hpy99 I, HpyCH4III, HpyCH4IV, HpyCH4V, Hph I, Kas I, Kpn I, Mbo I, Mbo II,
Mfe I, Mlu I, Mly I, Mnl I, Msc I, Mse I, Msl I, MspA1 I, Msp I, Mwo I, Nae I, Nar I, Nci I, Nco I,
Nde I, NgoM IV, Nhe I, Nla III, Nla IV, Not I, Nru I, Nsi I, Nsp I, Pac I, PaeR7 I, Pci I, PflF I, PflM
I, Ple I, Pme I, Pml I, Ppu[...]
I, Sac II, Sal I, Sap I, Sau3A I, Sau96 I, Sbf I, Sca I, ScrF I, SexA I, SfaN I, Sfc I, Sfi I, Sfo I, SgrA I,
Sma I, Sml I, SnaB I, Spe I, Sph I, Ssp I, Stu I, Sty I, Swa I, Taq I, Tfi I, Tli I, Tse I, Tsp45 I, Tsp509 I,
Tth111 I, Xba I, Xcm I, Xho I, Xma I, Xmn I and any other as yet uncharacterized or
undiscovered restriction endonucleases which may be utilized to cut DNA for the purposes of
implementing the presently described invention to discover transcription factor target genes of both 
known and unknown origin. 
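
The distinction between frequent and rare-cutting enzymes can be illustrated with a short calculation. The following is a minimal sketch, not part of the described procedure: it assumes random DNA with equal base frequencies, so that a site of n base pairs recurs on average every 4^n base pairs, and it performs a simple in silico digest of a linear sequence. The recognition sequences and cut positions shown for Alu I, EcoR I and Not I are the well-established ones; the function names are illustrative only.

```python
import re

# Recognition site and cut offset within the site (top strand) for three
# enzymes named in the list above. Function names below are hypothetical.
ENZYMES = {
    "Alu I": ("AGCT", 2),       # 4-bp site: frequent cutter
    "EcoR I": ("GAATTC", 1),    # 6-bp site
    "Not I": ("GCGGCCGC", 2),   # 8-bp site: rare cutter
}


def expected_fragment_length(site: str) -> int:
    """Mean site spacing in random DNA with equal base frequencies (assumption)."""
    return 4 ** len(site)


def digest(sequence: str, enzyme: str) -> list[int]:
    """Return fragment lengths from a complete digest of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # A lookahead finds every occurrence of the site, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    for name, (site, _) in ENZYMES.items():
        print(name, site, expected_fragment_length(site))
```

Under these assumptions a four-base cutter yields fragments of a few hundred base pairs on average, while an eight-base cutter yields fragments of tens of kilobases; real genomes deviate from this because base composition is not uniform.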


It is the ability to customize the length of said DNA fragments which allows for the cloning 
of transcription factor targets upon immunoprecipitation utilizing solid phase sequential 
immunoprecipitation. After fixation and subsequent sonication, fixed chromatin fragments of 
defined length binding the protein (i.e. transcription factor) of interest either directly or indirectly are 
purified by selective immunoprecipitation with antibodies specific to 1) proteins present within the 
core transcriptional machinery, an example of which is the large subunit (c) of RNA polymerase II, 
and 2) the particular transcription factor for which target genes are being sought (see below for 
detailed description of this procedure). As discussed below, it is the solid phase sequential 
immunoprecipitation procedure utilizing antibodies to both the core transcriptional machinery 
proteins as well as specific transcription factors which allows for the efficient cloning and 
characterization of coding sequences for transcription factor target genes. 

5.3 Multiple Rounds of Chromosomal Immunoprecipitation Reduce Background 

While it is clear that it is possible to obtain known in vivo target loci for numerous 
transcription factors utilizing conventional chromosomal immunoprecipitation technologies, an 
inherent problem is the retrieval of nonspecific protein/DNA complexes. These false positives are 
often the result of interactions between proteins and noncoding, inactive genomic DNA. While often 
relevant, these interactions may be those which occur at great distances from the transcription 
initiation site and thus the identification of coding sequence for the target loci pertaining to these 
protein/DNA contacts becomes difficult. The presently described technology circumvents the issues 
of nonspecificity and regulatory element distance from the transcription initiation site through an 
immunoprecipitation step utilizing antibodies to components of the basal transcriptional machinery. 
As outlined in Figure 5, chromatinized template is immunoprecipitated with antibodies specific for 
particular transcription factors. In order to enrich for loci actively regulated by these factors, the 
presently described invention first immunoprecipitates the chromatinized template with 
antibodies specific for the large subunit (c) of RNA polymerase II (Background reduction step 1). 
This "preIP" immunoprecipitation enriches for genes actively transcribed by the Pol II transcription 
machinery, thereby reducing the nonspecificity of the secondary immunoprecipitation, and helps to 
overcome problems related to the higher complexity of the genome by omitting noncoding regions and 
satellite DNA together with nontranscribed genes. Said "preIP" is performed via the solid phase 
sequential chromosomal immunoprecipitation protocol described herein and may be successfully 
implemented by those skilled in the art. 
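
The logic of the two-step enrichment can be sketched as successive filters over a pool of cross-linked fragments. This is an idealized model under stated assumptions: each immunoprecipitation is treated as perfectly specific and lossless, and the `Fragment` class, its fields and the locus names are hypothetical illustrations rather than anything defined by the procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A cross-linked chromatin fragment (hypothetical model)."""
    locus: str
    pol2_bound: bool  # large subunit of RNA polymerase II is cross-linked
    tf_bound: bool    # transcription factor of interest is cross-linked


def immunoprecipitate(fragments, is_bound):
    """Idealized pull-down: keep only fragments carrying the epitope."""
    return [f for f in fragments if is_bound(f)]


def sequential_ip(fragments):
    """Pol II pre-IP (background reduction) followed by the factor-specific IP."""
    pre_ip = immunoprecipitate(fragments, lambda f: f.pol2_bound)
    return immunoprecipitate(pre_ip, lambda f: f.tf_bound)


if __name__ == "__main__":
    pool = [
        Fragment("active_target", pol2_bound=True, tf_bound=True),
        Fragment("distal_noncoding_contact", pol2_bound=False, tf_bound=True),
        Fragment("unrelated_active_gene", pol2_bound=True, tf_bound=False),
        Fragment("satellite_dna", pol2_bound=False, tf_bound=False),
    ]
    print([f.locus for f in sequential_ip(pool)])
```

In this model only fragments carrying both epitopes survive, which is why factor contacts with noncoding, inactive DNA drop out after the first round.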

Figure 7 demonstrates the utility of the presently described invention as it pertains to 
chromosomal immunoprecipitation with antibodies specific for core chromatin proteins and 
antibodies specific for the large subunit of RNA polymerase II of Sciara coprophila (see Example 
6.1 for details and Weeks et al., Genes Dev., 1993, 7: 2329-2344). It illustrates the necessity of 
cross-linkage reversal as well as the customizable capability of sonication for the purposes of 
producing chromatin fragments which can be immunoprecipitated discretely with respect to core 
chromatin proteins or core transcriptional apparatus proteins. IgG antibodies utilized and 
contemplated by the present invention include those specific for the Drosophila melanogaster RNA 
Polymerase II large subunit. These antibodies also cross react with the large subunit of RNA 
Polymerase II from the fly species Sciara coprophila. The species of origin for these antibodies is goat. 
Termed gAP α-D1, the IgGs were affinity purified using a column carrying a fusion protein termed 
D1, which contains residues A519 - Gly992 of the IIc subunit. As well as cross-reacting with 
Sciara coprophila RNA Polymerase II, the antibodies mildly cross react with the large subunit of 
yeast as well as mammalian RNA Polymerase II (Weeks et al., Genes Dev., 1993, 7: 
2329-2344). A 1:1000 dilution of the original stock solution of 22 µg IgG in 50 µl PBS was used. In 
addition, a second set of antibodies affinity purified from rabbit immunosera, termed rAP α-PCTD, 
recognizes the hyperphosphorylated C-terminal domain of Drosophila RNA Polymerase II. A 
dilution of 1:500 of an original stock solution of 0.054 mg/ml in PBS/50% ethylene glycol was used. 
A third set of antibodies utilized in the presently described invention, termed gAP α-CTD, 
specifically recognizes the unphosphorylated C-terminal domain of the Drosophila RNA polymerase II 
large subunit. A 1:2000 dilution of an original stock solution of 0.51 mg/ml in 2X PBS was used. 
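
For orientation, the working antibody concentrations implied by the stated stocks and dilutions can be worked out as follows. This is a minimal sketch: the stock figures are those given above, while the function name and dictionary keys are illustrative, and the calculation assumes a simple volumetric dilution.

```python
def working_concentration(stock_ug_per_ml: float, dilution_factor: float) -> float:
    """Concentration in ug/ml after a 1:dilution_factor dilution (simple assumption)."""
    return stock_ug_per_ml / dilution_factor


# (stock concentration in ug/ml, dilution factor), taken from the text above.
ANTIBODIES = {
    "gAP alpha-D1": (22.0 * 1000 / 50, 1000),  # 22 ug in 50 ul = 440 ug/ml
    "rAP alpha-PCTD": (54.0, 500),             # 0.054 mg/ml
    "gAP alpha-CTD": (510.0, 2000),            # 0.51 mg/ml
}

if __name__ == "__main__":
    for name, (stock, factor) in ANTIBODIES.items():
        print(f"{name}: {working_concentration(stock, factor):.3f} ug/ml")
```

On these figures the three preparations are used at roughly 0.44, 0.11 and 0.26 µg/ml respectively.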

Other antibodies contemplated by the present invention include those directed against discrete 
regions of the RNA Polymerase II individual subunits including IIc. These antibodies may be of 
either monoclonal or polyclonal origin. Antibodies contemplated by the present 
invention include rabbit affinity purified polyclonal antibody specific for a peptide mapping within 
the tandem repeat domain of the large subunit of murine RNA Polymerase II. An additional 
antibody contemplated by the present invention includes an affinity purified rabbit polyclonal 
antibody raised against a peptide mapping to the amino terminus of the large subunit of RNA 
Polymerase II. Yet a third antibody contemplated by the present invention includes a rabbit 
polyclonal antibody raised against a recombinant protein corresponding to amino acids 1-224 of RNA 
Polymerase II of human origin (for review see Tjian, R. and Maniatis, T., Cell, 1994, 77: 5-8). The 
presently described invention covers any antibodies designed to interact with or bind specifically the 
large subunit of RNA polymerase II. 

The presently described invention is in no way limited to utilization of the above antibodies 
for purposes of first-round immunoprecipitation. Additionally, antibodies to other proteins and 
subunits present within the core basal transcriptional machinery may be utilized. It is contemplated 
by the present invention that sequential chromosomal immunoprecipitation utilizing antibodies to 
any protein present within the core transcriptional apparatus may substantially increase the ability to 
identify transcribed regions of transcription factor target loci (Kuras et al., Science, 2000, 19: 1244- 
1248). Subunits of the core transcriptional apparatus, specifically that of the transcriptional 
initiation complex, for which chromosomal immunoprecipitation may be successfully carried out as 
discussed in the presently described invention include, but in no way are limited to, the species RNA 
polymerase IIA, RNA polymerase IIB and RNA polymerase IIc. Other antibodies contemplated by 
the present invention may be designed to bind specifically to other core transcriptional apparatus 
proteins exclusive of the large subunit of RNA polymerase II (Nikolov et al., Proc. Natl. Acad. Sci. 
USA, 1997, 94: 15-22; Hoffmann et al., Proc. Natl. Acad. Sci. USA, 1997, 94: 8928-8935). These 
include, but in no way are limited to, TAP, TAF(I110), TAF(I48), TAF(I63), TAF(II100), 
TAF(II110), TAF(II125), TAF(II135), TAF(II145), TAF(II150), TAF(II170), TAF(II18), 
TAF(II19), TAF(II20), TAF(II25), TAF(II250), TAF(II250Delta), TAF(II28), TAF(II30), 
TAF(II30alpha), TAF(II30beta), TAF(II31), TAF(II40), TAF(II47), TAF(II55), TAF(II60), 
TAF(II61), TAF(II67), TAF(II70-alpha), TAF(II70-beta), TAF(II70-gamma), TAF(II80), TAF-1, 
TAF-90, TAF-I, TAF-II, TAF-L, TBF1, TBP, TBP-1, TBP-2, TFIIA (32 kDa subunit), TFIIA- 
alpha/beta precursor (major), TFIIA-alpha/beta precursor (minor), TFIIA-gamma, TFIIA-L, TFIIA- 
S, TFIIB, TFIID, TFIIE, TFIIE-alpha, TFIIE-beta, TFIIF, TFIIF-alpha, TFIIF-beta, TFIIH, TFIIH 
core, TFIIH*, TFIIH-CAK, TFIIH-CCL1, TFIIH-cyclin H, TFIIH-ERCC2/CAK, TFIIH-KIN28, 
TFIIH-MAT1, TFIIH-p50, TFIIH-p55, TFIIH-p62, TFIIH-p73, TFIIH-p80, TFIIH-p85, TFIIH-p90, 
TFIIH-SSL2/RAD25, TFIIH-TFIIK, TFII-I, TFIIIA and any other as yet uncharacterized or 
undiscovered proteins which interact with the core transcriptional machinery for purposes of 
initiating transcriptional activation. It should be noted that utilization of antibodies to these 
proteins may produce conflicting results, as evidence exists that genes may be repressed even in the 
presence of core transcriptional machinery proteins, with the exception of the dephosphorylated 
form of the large subunit (c) of RNA polymerase II (Tjian and Maniatis, Cell, 1994, 77: 5-8). As 
mentioned above, other as yet undiscovered and thus undescribed basal transcriptional apparatus 
proteins exclusive of the large subunit of RNA polymerase II are also contemplated by the present 
invention for the purposes of carrying out sequential immunoprecipitation to identify actively 
transcribed transcription factor target loci. 

Figure 8 demonstrates the utility of sequential immunoprecipitation for the purposes of 
identifying a known p53 target gene, p21. As is evidenced, very little quantitative PCR detection 
signal is lost due to sequential immunoprecipitation as compared to precipitation with antibodies 
specific only for the large subunit of RNA polymerase II (see the flowchart and lanes 1 through 4, 
which represent different stages of the sequential immunoprecipitation procedure, for details). As 
mentioned below, the presently described invention employs a solid phase support, in this 
case magnetic beads, to increase the yield of immunoprecipitated cross-linked chromatin during 
the implementation of sequential chromosomal immunoprecipitation. In order to provide a cross- 
linked protein/DNA complex as a substrate for the second round of immunoprecipitation with an 
antibody specific for the particular transcription factor being studied, it is necessary to release 
the previously bound solid phase support from the protein/DNA complex. This is accomplished in the 
presently described invention via pH alteration. By increasing the acidity of the complex mixture, 
antibodies linked to the solid phase are denatured and bound cross-linked DNA/protein complexes 
are released. In the experiment described, the pH was adjusted from a near-neutral value of pH 7.6 to 
pH 5.5 for the efficient denaturation of antibodies covalently linked to the solid phase. pH 
alteration may be performed to successfully denature antibodies on the solid phase by those 
skilled in the art, but the conditions must be determined experimentally for each particular antibody and solid 
phase support utilized for sequential immunoprecipitation. It is the denaturation of antibodies linked 
to the solid phase which allows for the release of cross-linked pull-down DNA/protein complexes and 
the next round of chromosomal immunoprecipitation to be carried out, and it is hence covered by the 
present invention. Other methods contemplated and covered by the present invention for the 
denaturation of antibodies bound to solid phase supports for the purposes of sequential 
immunoprecipitation include, but are in no way limited to, enzymatic digestion including but not 
limited to proteolysis, temperature alteration, and chemical, mechanical and UV dissociation. In 
addition, it is contemplated by the present invention that the junction between the antibody and its 
solid support matrix may also be manipulated by the above methods for removal of chromatin 
template for the purposes of second round immunoprecipitation. 
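
The size of the pH shift used for release can be put in terms of hydrogen ion concentration. The following is a brief worked calculation, assuming only the definition pH = -log10[H+]; the function name is illustrative.

```python
def fold_increase_in_hydrogen_ion(ph_start: float, ph_end: float) -> float:
    """Since [H+] = 10**(-pH), the fold increase is 10**(ph_start - ph_end)."""
    return 10 ** (ph_start - ph_end)


if __name__ == "__main__":
    # Shift described in the text: pH 7.6 to pH 5.5.
    print(f"{fold_increase_in_hydrogen_ion(7.6, 5.5):.1f}-fold")
```

A shift of 2.1 pH units therefore corresponds to roughly a 126-fold increase in hydrogen ion concentration, which gives a sense of scale when the release conditions are titrated for a different antibody or support.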

Table 1 delineates the identification of two previously uncharacterized target regulatory 
elements for the transcription factor p53 discovered through utilization of technology described by 
the present invention. The nucleotide sequences listed demonstrate near consensus p53 binding sites 
and elicit a severalfold increase in stimulation in standard cotransfection induction experiments. 
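
What "near consensus" means can be made concrete by counting mismatches against the published p53 consensus, which consists of two copies of the decamer 5'-RRRCWWGYYY-3' separated by 0 to 13 base pairs (R = purine, Y = pyrimidine, W = A or T). The sketch below is an illustration only: the scoring scheme (a simple mismatch count), the function names and the test sequences are assumptions made here and are not the sequences of Table 1.

```python
# IUPAC codes needed for the p53 half-site consensus.
IUPAC = {"R": "AG", "Y": "CT", "W": "AT", "A": "A", "C": "C", "G": "G", "T": "T"}
HALF_SITE = "RRRCWWGYYY"


def half_site_mismatches(decamer: str) -> int:
    """Count positions of a 10-mer that disagree with RRRCWWGYYY."""
    return sum(base not in IUPAC[code] for base, code in zip(decamer.upper(), HALF_SITE))


def best_p53_site(sequence: str, max_spacer: int = 13):
    """Scan for two half-sites separated by 0..max_spacer bp.

    Returns (total_mismatches, start, spacer) for the best match,
    or None if the sequence is shorter than two half-sites.
    """
    seq = sequence.upper()
    n = len(HALF_SITE)
    best = None
    for start in range(len(seq) - 2 * n + 1):
        first = half_site_mismatches(seq[start:start + n])
        for spacer in range(max_spacer + 1):
            second_start = start + n + spacer
            if second_start + n > len(seq):
                break
            total = first + half_site_mismatches(seq[second_start:second_start + n])
            if best is None or total < best[0]:
                best = (total, start, spacer)
    return best


if __name__ == "__main__":
    # Synthetic example: two perfect half-sites with a 3-bp spacer.
    print(best_p53_site("TTGGGCATGTCCAAAAGACTTGCCTTT"))
```

A candidate element with a total of zero to a few mismatches across the two half-sites would ordinarily be described as consensus or near consensus; functional relevance still has to be shown experimentally, for example by cotransfection as described above.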

It is clear that by performing sequential immunoprecipitation utilizing antibodies specific to 
the large subunit (c) of RNA polymerase II only actively transcribed transcription factor target genes 
will be identified, due to the required clearance of the promoter prior to large subunit attachment. It 
is possible, however, to also discover and identify genes which are actively repressed by 
transcription factors which recruit repressor molecules that inhibit promoter clearance. An example 
of such a repressor is NcoR (Heinzel et al., Nature, 1997, 387: 43-48). It is contemplated in the 
presently described invention that, by utilizing antibodies specific for NcoR in combination with 
antibodies specific for factors which act to repress gene transcription, genes may be identified 
which are exclusively repressed by a variety of transcription factors. Other repressor proteins 
thought to be recruited by DNA binding transcriptional repressors contemplated by the present 
invention and which may be utilized as targets for sequential immunoprecipitation include, but are in 
no way limited to, SMRT, SunCoR, FunCoR, SIN1, Sin3A (1), Sin3A (2), Sin3A (3), Sin3B, HP1 
and PcG (polycomb group proteins). In addition, proteins which bind selectively to methylated 
DNA are speculated to be involved in mediating or playing a role in transcriptional repression and/or 
long-term silencing. Thus these proteins serve as candidates for sequential immunoprecipitation to 
discover target genes actively repressed by certain transcription factors. The proteins covered by the 
present invention for the purposes of identifying repressed or silenced transcription factor target 
genes include, but are in no way limited to, the methyl DNA binding proteins MeCP1, MeCP2, 
MBD1, MBD2, MBD3 and MBD4. Other repressor proteins which have yet to be identified may 
also ultimately be targeted for sequential immunoprecipitation to define transcriptional repressor 




[…] specific for the above mentioned […] and/or […] in chromatin […] histone, proteins involved in 
[…] modification […] covered by […] HDAC5, HDAC6, HDAC7, HDAC8 […] HDAC1 […] other as yet 
undiscovered or uncharacterized […] 

As evidenced above, it is the […] with antibodies specific for the core 
transcriptional machinery which allows for […] by the present […] (Santa Cruz Biotechnology) […] 
and recognize the full length […] can be performed for a variety of tissue and cell […] 
sequential precipitation of cross-linked […] known target loci for these factors. […] undiscovered 
target loci for the transcription factor p53, it is […] contemplated and […] the present invention 
include, but are in no way limited to […] AF-1, ABI4, AC, ACE2, ACF, ADA2, ADA3, […] AFX-1, AG, 
AG1, AG2, […] AIC5, AID2, […] ALF1B, ALL-1, […] AML1c, AML1DeltaN, AML2, AML3, […] AP-1, 
AP-2, AP-[…] 




ATBP, AT-BP1, AT-BP2, ATF, ATF-1, ATF-3, ATF-3 deltaZIP, ATF-adelta, ATF-like, Athb-1, 
Athb-2, Ato, Axial, AZF1, B factor, B", BAF1, B-TFUD, band I factor, BAP, Barx-1, BAS, BBF1, 
BBF2a, BBF3, BBFa, Bed, BCFI, Bcl-3, BCL-6, BD73, BDF1, beta-1, BETA1, BETA2, beta- 
catenin, beta-factor, BF-1, BF-2, BGP1, Binl, Blimp-1, BmFTZ-Fl, B-Myb, B-Myc, BP1, BP2, B- 
Peru, BR-C Zl, BR-C Z2, BR-C Z4, Brachyury, BRF1, BrlA, Brn-3a, Brn-4, Brn-5, BUF1, BUF2, 
BAF1, BAS1, BCFH, beta-factor, BET A3, BLyF, BP2, BR-C Z3, brachyuray, brahma, BRF1, Brnl, 
Brn2, Brn-3a, Brn-3b, Bm-A, Brn-5, Bro, Btd, BTEB, BTEB2, BUF, BUF1, BUF2, BUR6, byr3, 
BZIP910, BZIP911, c-abl, c-Ets-1, c-Ets-2, c-Fos, c-Jun, c-Maf, c-myb, c-Myc, c-Qin, c-Rel, C/EBP, 
C/EBPalpha, C/EBPbeta, C/EBPdelta, C/EBPepsilon, C/EBPgamma, CI, CAC-binding protein, 
CACCC-binding factor, Cactus, Cad, CADI, CAF17, CAL, CAP, CAR2, CArG box-binding 
protein, CAT8, CAUP, CBF1, CBF2, CBF3, CBF4, CBF5, CBF-A, CBF-B, CBF-C, CBP, CBTF, 
CCAAT-binding factor, CCBF, CCF, CCG1, CCK-la, CCK-lb, CCR4, CD28RC, CDC10, Cdc68, 
CDF, cdk2, CDP, CDP2, Cdx-1, Cdx-2, Cdx-3, Cdx-4, CEBF, CEF1, ceh-1, ceh-10, ceh-12, ceh-13, 
ceh-14, ceh-16, CEH-18 and (all ceh related factors), CeMyoD, c-Ets-1, C-Ets-IA, c-Ets-lB, CF1, 
Cfla, CF2-I, CF2-H, CF2-DI, CFF, CG-1, CHA4, CHOP-10, Chox-2.7, ChxlO, CIN5, CIBB1, c- 
Jun, CKB3, Clox, c-Maf, CMB1, CMB2, c-Myb, c-Myc, CNBP, Cnc, CoMPl, core-binding factor, 
CoS, COUP, COUP-TF, CP1, CP1A, CP1B, CP1C, CP2, CPBP, CPC1, CPE binding protein CPRF- 
1, CPRF-2, CPRF-3, CPM10, CPM5, CPM7, CPPI, CPRF-1, CPRF-2, CPRF-3, CPRF-4a, CPRF- 
4b, all CREB related factors, CRE-BP1, CRE-BP2, CRE-BP3, CRE-BPa, CreA, CREB, CREB-2, 
CREBomega, CREMalpha, CREMbeta, CREMdelta, CREMepsilon, CREMgamma, 
CREMtaualpha, CRF, all CRM related factors, Croc, Crx, CRZ1, CSBP-1, CtBP, CTCF, CTF, 
CUM1, CUM10, CUP2, CUP9, CUS1, Cut, Cux, CWH-1, CWH-2, CWH-3, Cx, cyclin A, cyclin T, 
cyclin Tl, cyclin T2, cyclin T2a, cyclin T2b, CYS3, D-MEF2, Da, all DAL related factors, DAP, 
DAPI, DAT1, DAX1, DB1, DBF-A, DBF4, DBP, DBSF, dCREB, DDB, DDB-1, DDB-2, dDP, 
dE2F, DEAP3, DEF, DEFH2, Delilah, delta factor, deltaCREB, deltaEl, deltaEFl, deltaMax, 
DENF, DENF1, DENF2, DENF3, DEP, DEP2, DEP3, DEP4, DERmo-1, DF-1, DF-2, DF-3, Dfd, 
dFRA, DHR3, DHR38, DHR78, DHR96, dioxin receptor, dJRA, Dl, DDI, all Dlx related factors, 
DM-SSRP1, DMLP1, Dof3, DP-1, DP-2, Dpn, Drl, all DREB related factors, DRF1, DRF2, DRTF, 
DSC1, DSD?, DSP1, DST1, DSXF, DSXM, DTF, E, E1A, E2, E2BP, E2F, E2F-BF, E2F-I, E4, E47, 
E4BP4, E4F, E4TF2, E7, E74, E75, EAP1, EAP2L, EAP2S, EAR2, EBF, EBF1, EBNA, EBP, 
EBP40, EC, EG5, ECF, ECF2, ECF3, ECH, ECM22, EcR, E-TF, EF-1A, EF-G, EF1, EFgamma, 
EGM1, EGM2, EGM3, Egr, EGR2, EGR3, eH-TF, EUa, EivF, EKLF, Elf-1, Elg, Elk-1, ELP, Elt-2, 
EmBP-1, embryo DNA binding protein, Emc, EMF, EMF2, EMF3, EMF4, Ems, Emx, Emx-1, Emx- 
2, En, ENH-binding protein, ENKTF-1, epsilonF1, ER, ERbeta, EREBP-1, EREBP-2, EREBP-3, 

EREBP-4, ERF1, Erg, Esc, Escl, esg, Esx-la, Esx-lb, ETF, ETL, Eve, Evi, Evx, Exd, Ey, en-1, en- 
2, f(alpha-f(epsilon), F27E5.2, F2F, FACB, F-ACT1, factor 1, factor 2, factor 3, factor Bl, factor 
B2, factor delta, factor I, FAR, Fbfl, FBF-A1, FBP, FBP1, FBP11, FBP2, FBP6, FBP7, f-EBP, 
FHL1, FIM, FKBP59, Fkh, FKH1, Fkh-1, FKH2, Fkh-2, FIch-3, Fkh-4, Fkh-5, Fkh-6, FKHR, 
FKHRL1, FKHRL1P1, FKHRL1P2, FKHRP1, FlbD, FLC, FLF, Flh, Fli-1, FLO, FLO8, FLV-1, 
FOG, FosB, FosB/SF, Fra-1, Fra-2, Freac-1, Freac-10, Freac-2, Freac-3, Freac-4, Freac-5, Freac-6, 
Freac-7, Freac-8, Freac-9, FRG Yl, FRG Y2, FTF, FTS, Ftz, FTZ-F1, FTZ-FlbetaJFZFIG factor, G 
factor, G/HBF-1, G10BP, G6 factor, GA-BF, GABP, GABP-alpha, GABP-betal, GABP-beta2, 
GAF, GAF1, GAF2, GAG2, GAL11, GAL4, GAL80, GammaCAAT, gammaCACl, gammaCAC2, 
gamma-factor, gammaOBP, GAMYB, GAT1, GAT2, GAT3, GAT4, GATA-1, GATA-1A, GATA- 
1B, GATA-2, GATA-3, GATA-4, GATA-5, GATA-5A, GATA-, GATA-6, GATA-6A, GATA-6B, 
GBF, GBF1, GBF12, GBF1A, GBF1B, GBF2, GBF2A, GBF2B, GBF3, GBF4, GBF9, GBP, GC1, 
GC2, GC3, GCF, GCM, GCMa, GCMb, GCN4, GCN5, GCNF, GCR1, GCR2, GE1, GEBF-I, GF1, 
GFI, Gfi-1, GFH, GHF3, GHF-5, GHF-7, GIS1, GKLF, GL1, G115, G12, Glass, GLI, GLI3, GLN3, 
GLO, GM-PBP-1, GP, GR, GR alpha, GR beta, GRF-1, Grg-4, Grg-5, GRIP1, Groucho, Gsb, 
GSBF1, Gsbn, Gsc, Gsc A, Gsc B, Gt, GT-1, GT-2, GT-IC, GT-IIA, GT-flBalpha, GT-UBbeta, 
GTS1, Gtx, GZF3, H16, H1TF1, H1TF2, H2B abp 1, H2RHBP, H4TF-1, H4TF-2, HAC1, HAL9, 
HALF-1, HAP1, HAP2, HAP3, HAP4, HAP5, Hb, HB9, HBLF, HBP-1, HBP-la, HBP-la(l), HBP- 
la(cl4), HBP-lb, HBP-lb(cl), HCM1, HDaxx, heat-induced factor, HEB, HEBl-p67, HEBl-p94, 
HEF-1B, HEF-1T, HEF-4C, HEN1, HEN2, HeRunt-1, HES-1, HES-2, HES-3, HES-5, Hesxl, Hex, 
HFH-1, HFH-11A, HFH-11B, HFH-2, HFH-3, HFH-4, HFH-5, HFH-6, HFH-7, HFH-8, HDP-1, 
HIF-lalpha, HIF-lbeta, HiNF-A, HiNF-B, HiNF-C, HiNF-D, HiNF-D3, HiNF-E, HlNF-M, HLNF-P, 
HIP1, HIR1, HIR2, HIR3, HERA, HTV-EP2, Hf, Hlf-alpha, Hlf-beta, HLX, Hlx, HMBP, HMG I, 
HMG I(Y), HMG Y, HMGI-C, HMS1, HMS2, HNF-1, HNF-1A, HNF-1B, HNF-1C, HNF-3, 
HNF3(-like), HNF-3alpha, HNF-3B, HNF-3beta, HNF-3gamma, HNF-4, HNF-4(D), HNF-4alphal, 
HNF-4alpha2, HNF-4alpha3, HNF-4alpha4, HNF-4alpha7, HNF-4beta, HNF-4gamma, HNF-6, 
HNF-6alpha, HNF-6beta, hnRNP K, Hoxll, HOXA1, HOXA10, HOXA10 PL2, HOXA11, 
HOXA13, HOXA2, HOXA3, HOXA4, HOXA5, HOXA6, HOXA7, HOXA9, HOXB1, HOXB2, 
HOXB3, HOXB4, HOXB5, HOXB6, HOXB7, HOXB8,HOXB9, HOXC10, HOXC11, HOXC12, 
HOXC13, HOXC4, HOXC5, HOXC6, HOXC6 (PRI), HOXC6 (PRII), HOXC8, HOXC9, HOXD1, 
HOXD10, HOXD11, HOXD12, HOXD13, HOXD3, HOXD4, HOXD8, HOXD9, HP1 site factor, 
Hp55, Hp65, HrpF, HSE-binding protein, HSF, HSF1, HSF2, HSF24, HSF30, HSF8, hsp56, 
Hsp90, HST, HSTF, HY5, IBF, IBP-1, IBR, ICER, ICER-I, ICER-Igamma, ICER-II, ICER- 
IIgamma, ICP4, ICSBP, Id1, Id1.25, Id1H', Id2, Id3, Id3 / Heir-1, Id4, IDS1, IE1, IEBP1, IEFga, 
IF1, IF2, IFH1, IFNEX, IgPE-1, IgPE-2, IgPE-3, Ik-1, Ik-2, Ik-3, Ik-4, Ik-5, Ik-6, Ik-7, Ik-8, 
IkappaB, IkappaB-alpha, IkappaB-beta, IkappaB-gamma, IkappaB-gamma1, IkappaB-gamma2, 
IkappaBR, IKI3, ILF, ILRF-A, IME1, IME4, INO2, INO4, INSAF, IPF1, I-POU, IRBP, IRE-ABP, 
IREBF-1, IRF-1, IRF-2, IRF-3, Irx-2a, Irx-3, ISGF-1, ISGF-3, ISGF-3alpha, ISGF-3gamma, Isl-1, 
ISRF, ISRFI, ITF, ITF-1, ITF-2, IUF-1, Ixr1, JRF, Jun-D, JunB, JunD, K06B9.5, K07C11.1, kappaY 
factor, KAR4, KBF2, kBF-A, KBP-1, KCS1, KER1, KER-1, Kid-1, Kin17, KN1, Kni, Knox3, KNRL, 
Kox1, Kr, Kreisler, KRF-1, Krox-20, Krox-24, Ku autoantigen, KUP, Lab, LAC9, LBP, LBP-1, 
LBP-1a, Lc, LCR-F1, LD, Ldb1, LEF-1, LEF-1B, LEF-1S, LEU3, LF-A1, LF-A2, LF-B2, 
LF-C, LFY, LG2, LH-2, Lhx-3, Lhx-3a, Lhx-3b, Lhx-4, LHY, Lim-1, Lim-3, lin-1, lin-11, lin-14A, 
lin-14B1, lin-14B2, lin-29A, lin-29B, lin-31, lin-32, lin-39, LIP15, LIP19, LIT-1, LKLF, Lmo1, 
Lmo2, Lmx-1, L-Myc1, L-Myc-1, L-Myc-1 (long form), L-Myc-1 (short form), L-Myc-2, LR1, LSF, 
LSIRF-2, LUN, Lva, LVb-binding factor, LVc, LXRalpha, LyF-1, Lyl-1, LYS14, Lz, M factor, M- 
Twist, M1, m3, Mab-18, MAC1, Mad, MAF, MafB, MafF, MafG, MafK, Mal63, MAPF1, MAPF2, 
MASH-1, MASH-2, mat-Mc, mat-Pc, MATa1, MATalpha1, MATalpha2, MATH-1, MATH-2, 
Max1, M factor, M1, m3, Mab-18 (284 AA), Mab-18 (296 AA), mab-5, MAC1, Mad1, Mad3, 
Mad4, MADS1, MADS11, MADS16, MADS2, MADS24, MADS3, MADS4, MADS45, 
MADS5, MADS6, MADS7, MADS8, MADS9, MAF, MafB, MafF, MafG, MafK, MAL13, 
MAL23, MAL33, MAL63, MAPF1, MAPF2, MASH-1, MASH-2, Mat1-Mc, MATa1, MATalpha1, 
MATalpha2, MATH-1, MATH-2, mat-Pc, Max, Max1, Max2, MAZ, MAZi, MB67, MBF1, MBF-1, 
MBF2, MBF3, MBF-I, MBP1, MBP-1 (1), MBP-1 (2), MBP-2, MCBF, MCM1, 
MCM1+MATalpha1, MDBP, MDBP-2, MDS3, mec-3, MECA, MED11, MED2, MED4, MED6, 
MED7, MED8, mediating factor, MEF1, MEF-2, MEF-2B, MEF-2B-1, MEF-2B-2, MEF-2B-3, 
MEF-2B-4, MEF-2C, MEF-2C (433 AA form), MEF-2C (465 AA form), MEF-2C (473 AA form), 
MEF-2C/delta32 (441 AA form), MEF-2D, MEF-2D (506 AA form), MEF-2D (514 AA form), 
MEF-2D00, MEF-2D0B, MEF-2DA'0, MEF-2DA'B, MEF-2DA0, MEF-2DAB, Meis-1, Meis-1-1, 
Meis-1-2, Meis-1-3, Meis-1-4, Meis-1a, Meis-1b, Meis-2a, Meis-2b, Meis-2c, Meis-2d, Meis-3, 
Meso1, MET18, MET28, MET31, MET32, MET4, Mf2, MF3, MFH-1, Mfh-1, MGA1, Mhox, 
MHR1, Mi, MIBP1, MIF-1, MIG1, MIG2, Mix.1, Mix.2, Mix.3, Mix.4, Mixer, MIXTA, Miz-1, 
MKR2, MLP, MM-1, MNB1a, MNB1b, MNF1, MNR2, MOK-2, MOP3, MOT1, MOT3, MP4, 
MPBF, MR, MRF4, MRR, Msh, MSN1, MSN2, MSN4, Msx-1, Msx-2, MTB-Zf, MTF1, MTF-1, 
MTH1, MtU, mtTF1, M-Twist, muEBP-B, muEBP-C2, MUF1, MUF2, Mxi1, MYB A, MYB.PH1, 
MYB.PH2, MYB.PH3, MYB1, Myb-1, all Myb related proteins, MYB-P1, MYBST1, myc-CF1, 
myc-PRF, MYC-RP, Myef-2, Myf-3, Myf-4, Myf-5, Myf-6, Myn, MyoD, Myogenin, MZF-1, 
Nabl, Nau, NBF, NCI, NCB2, NDT80, NELF, NePl, NER1, Net, NeuroD, NF ffl-a, NF HI-c, NF 
m-e, NF-1, NF-l/L, NF-l/Redl, NF-1A, NF-1A1, NF-1A1.1, NF-1A2, NF-1A3, NF-1A4, NF-1A5, 
NF-1B, NF-1B1, NF-1B2, NF-1B3, NF-1B4, NF-1C1, NF-1C2, NF-1C4, NF-1X, NF-1X1, NF-1X2, 
NF-1X3, NF2d9, NF-4FA, NF-4FB, NF-4FC, NF-A, NF-A3, NF-AB, NFalphal, NFalpha2, 
NFalpha3, NFalpha4, NF-AT, NFAT-1, NF-AT3, NF-Atc, NF-ATc3, NF-Atp, NF-Atx, NF-BA1, 
NfbetaA, NF-CLEOa, NF-CLEOb, NF-D, NFdeltaESA, NFdeltaE3B, NFdeltaESC, NFdeltaE4A, 
NFdeltaE4B, NFdeltaE4C, Nfe, NF-E, NF-Elb, NF-E2, NF-E2 p45, NF-E3, NF-E4, NFE-6, NF- 
EM5, NF-Gma, NF-GMb, NF-H1, NF-H2, NF-H3, NFH3-1, NFH3-2, NFH3-3, NFH3-4, NF-IL-2A, 
NF-IL-2B, NF-InsEl, NF-InsE2, NF-InsE3, NF-jun, NF-kappaB, NF-kappaB(-like), NF-kappaBl, 
NF-kappaBl precursor, NF-kappaB2, NF-kappaB2 (p49), NF-kappaB2 precursor, NF-kappaEl, NF- 
kapp aE2, NF-kappaE3, NF-lambda2, NF-MHCDA, NF-MHCKB, NF-muEl, NF-muE2, NF-muE3, 
NF-muNR, NF-ODC1, NF-S, NF-TNF, NF-U1, NF-W1, NF-W2, NF-X, NF-X1, NF-X2NF-X3, 
NF-Xc, NF-Y, NF-Y', NF-YA, NF-YB, NF-YC, NF-Zc, NF-Zz, NGFI-B, NGFI-C, NHP-1, NHP- 
2NHP3, NHP4, NHR1, NIP, NIRA, NIT2, NTT4, Nkx-2.1, Nkx-2.2, Nkx-2.5, NLS1, NMH7, 
NMHC5, Nmi, N-Myc, N-Mycl, N-Myc2, nob-lA, nob-IB, N-Oct-2alpha, N-Oct-2beta, N-Oct-3, 
N-Oct-4, N-Oct-5a, N-Oct-5b, NOR1, NOT, NOT1, NOT2, NOT3, NOT5, NP-HI, NP-IV, NP-TCT, 
NP-Va, NPX1, NRD I, Nrf 1, NRF-1, Nrf2, NRF-2NRF-2betal , NRF-2gammal, NRFA, NRG1, 
NRG2, NRL, NS-1, NSDD, NTF, NTF1, NUC-1, Nur77, NUT1, NUT2, OBF, OBF-1, OBF3.1, 
OBF3.2, OBF4, OBF5, OBP, OBP1, OC-2, OCA-B, OCSBF-1, OCSTF, Oct-1, Oct-10, Oct-11, 
Oct-IA, Oct-IB, Oct-lC, Oct-2, Oct-2.1, Oct-2.3, Oct-2.4, Oct-2.6, Oct-2.7, Oct-2.8, Oct-2B, Oct- 
2C, Oct-4, Oct-4A, Oct-4B, Oct-5, Oct-6, Oct-7, Oct-8, Oct-9, Octa-factor, octamer-binding factor, 
oct-B2, oct-B3, Oct-R, Odd, ODR7, OG-12, OG-2, OG-9, OHP1, OHP2, Olf-1, OM1, ONR1, 
Opaque-2, OPM1, OSBZ8, Otd, Otxl, Otx2, Otx4, Ovo, OZF, P Gong form), P (short form), PI, 
P 107, pl30, p28 modulator, P 300, P 38erg, p40x, p45, p49erg, P 53as, p55, p55erg, p58, p65delta, 
p67, PAB1, PacC, PAF1, pag-3, PAGL1, pal-1, Papl+, par-2, Paraxis, PARP, Pax-1, Pax-1/9, Pax- 
1/9 (AmphiPax-1), Pax-l/9-I, Pax-l/9-H, Pax-l/9-IH, Pax-l/9-TV, Pax-l/9-V, Pax-l/9-VI, Pax-2, 
Pax-2.1, Pax-2.2, Pax-2/5/8, Pax-2a, Pax-2b, Pax-3, Pax-3A, Pax-3B, Pax-4, Pax-4a, Pax-4b, Pax- 
_4.c,.Pax^i,.Pax,5, Pax-6, Pax,6.(Pax.QNR), Pax-6 /,Pd,5a, Pax,6 l-2.1,.Pax-.6a2.2,Pax.6 4.1,Pax.6-- 
4.2, Pax-6 J2, Pax-7, Pax-8, Pax-8a, Pax-8b, Pax-8c, Pax-8d, Pax-Se, Pax-8f, Pax-8g, Pax-9, Pax-A, 
Pax-B, Pb, PBF, PBP, Pbx-1a, Pbx-1b, Pc, PC2, PC4, PC4 p9, PC5, Pcr1, PCRE1, PCT1, PDM-1, 
PDM-2, PDR1, PDR3, Pdx-1, PEA1, PEA2, PEA3, PEB1, PEBP2, PEBP2alpha, 
PEBP2alphaA/Osf2, PEBP2alphaA/til-1, PEBP2alphaA/til-1 (Y), PEBP2alphaA/til-1(U), 

PEBP2alphaAl, PEBP2alphaA2, PEBP2alphaBl, PEBP2alphaB2, PEBP2beta, PEBP2betal, 
PEBP2beta2, PEBP2beta3, PEBP5, Pep-1, PERIANTHIA, pes-lapes-lb, PF1, PF3, PGA4, PGD1, 
pha-4, PHAN, PHD1, phiAP3, PH02, PH04, PHO80, Phox-2, php-3, PI, PI1, PI2, pie-1, PIHbox9, 
PIP2, Pit-1, Pit-la, Pit- lb, Pit-lc, Pitx-3, PLE, PLE/DEFH200, PLE/DEFH49, PLE/DEFH72, 
PLE/SQUA, PLZF, PNPI2, PO-B, pointedPl, pointedP2, Pontin52, pop-lPOP2, POTM1-1, pou[c], 
Pou2, pox neuro, PP1, PP2, PPAR, PPARalpha, PPARbeta, PPARgamma, PPR1, PPUR, PPYR, PR, 
PR A, PRb, Prd, PRDI-BF1, PRDI-BFc, PREB, Prop-1, protein a, protein b, protein c, protein d, 
PRP, PSE1, Psx-1, Psx-2, P-TEFb, PTF, PTF1, PTFl-alpha, PTFl-beta, PTFalpha, PTFbeta, 
PTFdelta,, PTFgamma, Ptx-1, Ptx-2, Ptx-2B, Pu box binding factor, Pu box binding factor (BJA-B), 
PU.l, Pu.1, PUB1, PuF, PUF-I, Pur factor, Pur-1, PTJT3, P-wr, PX, PZF1, qa-lF, QBP, QUT1, R, 
Rl, R2, RAD1, Rad-1, RAD18, RAD2, RAF, RAP1, RAP2.5, RAR, RAR-alpha, RAR-alphal, 
RAR-alpha2, RAR-beta, RAR-betal, RAR-beta2, RAR-beta3, RAR-beta4, RAR-gamma, RAR- 
gammal, RAR-gamma2, RAVI, RAV2, Rax, Rb, RBP60, RBP-Jkappa, Rc, RC1, RC2, RCS1, 
REB, REB1, Reblp, RelA, RelB, repressor of CAR1 expression, REV-ErbAalpha, REX-1, RF1, 
RF2a, RFX, RFX1, RFX2, RFX3, RFX5, RF-Y, RGM1, RGR1, RGT1, RIC1, RTM1, RIP14, RITA- 
1, RLM1, RME1, RMS1, Ro, Roaz, ROM1, ROM2, RORalphal, RORalpha2, RORalpha3, 
RORbeta, RORgamma, Rox, Roxl, ROX3, RPF1, RPGalpha, RPH1, RREB-1, RRF1, RRF2, RRF3, 
RRN10, RRN11, RRN3, RRN5, RRN6, RRN7, RRN9, RS2, RSC4, RSRFC4, RSRFC9, RSV-EF-H, 
RTF1, RTG1, RTG2, RTG3, Runt, RVF, Rx, Rxl, Rx2, Rx3, RXR-alpha, RXR-beta, RXR-betal, 
RXR-beta2, RXR-gamma, S8, SAP1, SAP-la, SAP-lb, SBF, SBF-1, Sc, SCBPalpha, SCBPbeta, 
SCBPgamma, SCD1/BP, SCM-inducible factor, Scr, S-CREM, S-CREMbeta, Sd, Sdc-1, SDS3, 
SEF1, SEF-1 (1), SEF-1 (2), SEF3,SEF4, SEM-4, SET1.SET2, SF1, SF-1, SF-2, SF-3, SF-A, SFL1, 
SGC1, SGF-1, SGF-2, SGF-3, SGF-4, Shn, SHP, SHP1, SHP2, SIF, SIG1, SHI, SIE-pllO, Sm-pl5, 
Sm-pl8, Siml, Sim2, Six-1, Six-2, Six-3, Six-3alpha, Six-3beta, Six-4, Six-4A, Six-4B, Six-4C, 
Six-5, Six-6,Skn-l, SKN7, SKOl, SLM1, SLM2, SLM3, SLM4, SLM5, Slpl,slp2, S-Myc, Sn, SN 
(sienna), Sna, SNF5, SNF6, SNP1, So, SOX-11, SOX-12, Sox-13, SOX-15, Sox-18, Sox-2, Sox-4, 
Sox-5, SOX-6, SOX-9, Sox-LZ, Spl, Sp2, Sp3, Sp4, SPA, spE2F, Sph factor, Spi-B, SpOtx, Sprm- 
1, SpRunt-1, SQUA, SRB10, SRB11, SRB2, SRB4, SRB5, SRB6, SRB7, SRB8, SRB9, SRD1, SRE 
BP, SREBP-1, SREBP-la, SREBP-lb, SREBP-lc, SREBP-2, SREP, SRE-ZBP, SRF, SRY, Sry h-1, 
Sry-beta, Sry-delta, ssDBP-1, ssDBP-2, SSRP1, Staf, Staf-50, STAT, STAT1, STAT1alpha, 
STATlbeta, STAT2, STAT3, STAT4, STAT5, STAT5A, STAT5B, STAT6, STC, STDl, Stell, 
STE12, STE4,STF1, STF2, STKA, STM, STP1, Stral3, StuAp, su(f), Su(H),su(Hw), SUM-1, 
SUP, SVP, SVP46, SWI/SNF complex, SWI1, SWI2, SWI3, SWI4, SWI5, SWI6, SWP, T-Ag, t- 
Pou2, T3R, T3R-alpha, T3R-alpha1, T3R-alpha2, T3R-beta, T3R-beta1, T3R-beta2, TAB, T-Ag, 
TAG1, Tal-1, Tal-1beta, Tal-2, TAR factor, Tat, Tax, TCF, TCF-1, TCF-1A, TCF-1B, TCF-1C, TCF- 
1D, TCF-1E, TCF-1F, TCF-1G, TCF-2, TCF-2alpha, TCF-3, TCF-3B, TCF-3C, TCF-3D, TCF-4, TCF- 
4(K), TCF-4B, TCF-4E, TCF-A, TCF-B, TCFbeta1, TDEF, TEA1, TEC1, TEF, TEF 1, TEF-1, 
TEF2, TEF-2, Tel, TF68, TFE3, TFE3-L, TFE3-S, TFEB, TFEC, TFIIA, TFELA (13.5 kDa subunit), 
Tf-LFl, Tf-LF2, TF-Vbeta, TGA,TGA1, TGAla, TGA2, TGA3, TGA6, TgFl, TGGCA-binding 
protein, TGT3, Thl, THM1, THM18, THM27, THRA1, T3F1, TDF2, TTN-1, TINY, TIP, tl-POU, 
TLE1, Til, Tlx, TM3, TM4.TM5, TM6, TM8, TMF, t-Pou2, TR2, TR2-11, TR2-9, TR3, TR4,Txa-l 
Gong form), Tra-1 (short form), TRAP, TREB-1, TREB-2, TREB-3, TREF1, TREF2, TRF, TRF (2), 
Trident, TSAP, TSF3, Tsh,TTF-l, TTF-2, TTG1, Ttk 69K, Ttk 88K, TTP.Ttx, ttx-3, TUBF, Twi, 
TxREF, TyBF, UAY, UBF, UBF1, UBF2, UBP-1, Ubx, UCRB, UCRF-L, UEF-1, UEF-2, UEF-3, 
UEF-4, UF1-H3beta, UFA, UFB, UFO, UGA3, UHF-1, UME6, unc-30, unc-37, unc-4, Unc-86, 
URF, URSF, URTF, USF, USF2, vab-3, vab-7, vaccinia virus DNA-binding protein, Vav, Vax-1, 
Vax-2, VBP, VDR, v-ErbA, VETF, v-Ets, v-Fos, vHNF-1, vHNF-lA, vHNF-lB, vHNF-lC, VITF, 
v-Jun, v-Maf, Vmw65, v-Myb, v-Myb/v-Ets, V-Myc, v-Myc, Vpl, Vpr, v-Qin, v-Rel, VSF-1, WC1, 
WC2, Whn, WT1, WT1I, WZF1, X-box binding protein, X-Twist, X2BP, xaml, X-box binding 
protein, XBP-1, XBP-2, XBP-3, XF1, XF2, XFD-1, XFD-2, XFD-3, XFG20, XGRAF, Xirol, 
Xiro2, Xiro3, xMEF-2, XPF-1, XrpFI, XW, XX, yan, YB-1, YB-3, Ybx-3, YEB3, YEBP, Yi, 
YNG2, YPF1, YY1, ZAP, ZEB, ZEM1, ZEM2/3, Zen-1, Zen-2, Zeste, ZF1, ZF2, ZF5, Zfh-1, Zfh-2, 
Zfp-35, ZID, ZIP-1A, ZDP-2A, ZIP-2B, ZM1, ZM38, Zmhoxla, Zn-15, ZNF174, ZPT2-1, ZPT2-2, 
ZPT2-3, ZPT2-4, Zta. In addition, any factors which retain the ability to regulate gene expression, 
either through activation or repression, and which are as yet undiscovered or 
uncharacterized are covered by the present invention. 

While the procedure of sequential immunoprecipitation of cross-linked protein/DNA 
complexes for purposes of detecting actively transcribed target genes in the presently described 
invention involves the precipitation of protein/DNA complexes utilizing antibodies 
specific to the large subunit (c) of RNA polymerase II first and antibodies specific for the 
transcription factor of interest second, it is in no way limited to this particular order of 
immunoprecipitation. It is contemplated by the present invention that the immunoprecipitation 
procedure may be reversed and thus performed with antibodies specific for the transcription factor of 
interest first and antibodies specific for the large subunit of RNA polymerase II second, although it 
is possible that recovery of target loci may be reduced due to the initial precipitation of genes not 
activated by said transcription factor of interest. 



It is also contemplated and covered by the presently described invention that sequential 
rounds of immunoprecipitation may be performed with antibodies specific to cell type and tissue 
restricted transcription factors for the purposes of identifying target genes for multiple factors. For 
example, the technology described herein may be utilized to search for loci which are targets for 
regulation by both p53 and Rb, or by both Pit-1 and GATA2 (El-Deiry et al., Cell, 1993, 75: 817-825; 
Dasen et al., Cell, 1999, 97: 587-598). In addition, it is contemplated by the present invention 
that coimmunoprecipitation utilizing antibodies specific for more than one transcription factor 
simultaneously may be successfully performed for the purposes of identifying target loci for two or 
more transcription factors. 



5.4 Solid Phase Chromosomal Immunoprecipitation Increases both Yield and Sensitivity 

The presently described invention utilizes magnetic beads linked covalently to either 
monoclonal or polyclonal antibodies specific for discrete and particular transcription factors (Dynal 
Corporation). Implementing solid phase separation techniques for immunoprecipitation 
considerably enhances both the amount of material recovered and the specificity for real in vivo 
interactions. This is due primarily to the increased ability to recover 
protein/DNA complexes of limited quantity and to the implementation of additional washing procedures as 
compared to immunoprecipitation performed without a solid phase base. A diagrammatic 
illustration of the use of solid phase technology to increase yield and sensitivity is represented in 
Figure 3. Cross-linked DNA/protein material is combined with magnetic Dynabeads 
upon which antibodies to the protein of interest have been conjugated. Use of a magnet results in 
purification of protein/DNA complexes of interest. Subsequent washing steps allow for the removal 
of the unbound cellular debris, proteins and DNA fragments. Magnetic bead/protein/DNA 
complexes are subsequently subjected to further analysis as discussed below. In the presently 
described invention linkage of antibodies to the solid phase support magnetic beads is accomplished 
via standard protocol (Dynal Corporation product information and specifications) and those known 
and skilled in the art are capable of [...] 

in phosphate buffered saline (PBS), pH 7.4. 0.1-1.5ug of antibody is added per ml of beads, the 
volume adjusted and the mixture incubated for 24 hours at 4 deg. C on a rotating shaker. The beads 
are subsequently collected in test or Eppendorf tubes via a magnet and the supernatant removed. 
After two more rounds of washing in 10mM Tris-HCl, pH 7.6 for an additional 16-24 hours the 
bead/antibody complex is ready for sequential immunoprecipitation of protein/DNA complexes. 

The particular magnetic beads utilized as a solid phase supporting material in the presently 
described invention are Dynabeads M-450 Tosylactivated (Dynal Corporation). Other magnetic 
beads contemplated by the present invention and created by Dynal Corporation which may be 
utilized as a solid phase support for the chromosomal immunoprecipitation reaction described herein 
include Dynabeads M-450 uncoated, Dynabeads M-280 Tosylactivated, Dynabeads M-450 Sheep 
anti-Mouse IgG, Dynabeads M-450 Goat anti-Mouse IgG, Dynabeads M-450 Sheep anti-Rat IgG, 
Dynabeads M-450 Rat anti-Mouse IgM, Dynabeads M-280 Sheep anti-Mouse IgG, Dynabeads M-280 
Sheep anti-Rabbit IgG, Dynabeads M-450 Sheep anti-Mouse IgG1, Dynabeads M-450 Rat anti-Mouse 
IgG1, Dynabeads M-450 Rat anti-Mouse IgG2a, Dynabeads M-450 Rat anti-Mouse IgG2b, 
Dynabeads M-450 Rat anti-Mouse IgG3. Other magnetic beads which are also contemplated by the 
present invention as providing utility for the purposes of sequential immunoprecipitation include 
streptavidin coated Dynabeads. 



While the presently described invention employs magnetic beads as the solid phase to 
increase yield and recovery of protein/DNA complexes during sequential chromosomal 
immunoprecipitation, it is in no way the only solid phase support system which may be implemented 
successfully to increase yield and sensitivity. Other solid phase supports contemplated by the 
present invention include, but are not limited to, sepharose, chitin, protein A cross-linked to agarose, 
protein G cross-linked to agarose, agarose cross-linked to other proteins, ubiquitin cross-linked to 
agarose, thiophilic resin, protein L cross-linked to agarose and any 
support material which allows for an increase in the efficiency of purification of protein/DNA 
complexes. 



An alternative method of attaching antibodies to magnetic beads or other solid phase support 
material contemplated by the present invention is the procedure of chemical cross-linking. 
Cross-linking of antibodies to beads may be performed by a variety of methods and may involve the 
utilization of a chemical reagent which facilitates the attachment of the antibody to the bead, 
followed by several neutralization and washing steps to further prepare the antibody coated beads for 
sequential immunoprecipitation. 

Yet another method of attaching antibodies to magnetic beads contemplated by the present 
invention is the procedure of UV cross-linking. A third method of attaching antibodies to magnetic 
beads contemplated by the present invention is the procedure of enzymatic cross-linking. 

The presently described invention implements the solid phase material in solution together with 
antibody/protein/DNA complexes, yet other methodology, such as that which utilizes a column 
support fixture rather than a solution format, may be successfully employed for purposes of solid 
phase sequential chromosomal immunoprecipitation. In addition, support fixtures such as Petri 
dishes, chemically coated test tubes or Eppendorf tubes which may have the capability to bind 
antibody coated beads or other antibody coated solid phase support materials may also be employed 
by the present invention. 



In the presently described invention the superparamagnetism of the beads allows for the use 
of a conventional magnet to separate bead/antibody/protein/DNA complexes from nonspecific 
interactions present within the reaction mixture. The magnetic property of the bead is due to the 
presence of γ-Fe2O3 and Fe3O4 found within the bead (Dynal Corporation product information and 
specifications), although it is also contemplated by the present invention that a number of other 
materials possessing magnetic properties may be sufficient to confer an ability for efficient 
separation of bead/antibody/protein/DNA complexes from nonspecific materials in the reaction 
mixture. 



Upon sufficient isolation of protein/DNA complexes utilizing the solid phase sequential 
immunoprecipitation technologies described in the present invention it is necessary to reverse 
cross-linkages so that DNA fragments containing transcription factor target genes may be precipitated and 
further manipulated through both standard and modified molecular biology procedures. Those 
known and skilled in the art are capable of successfully reversing cross-linkages via conventional 
chromosomal immunoprecipitation protocols. Reversal of cross-linkages is accomplished through 
an incubation of the isolated protein/DNA complexes at high temperatures, preferably 65 deg. C for 
24 hours. It is also contemplated by the present invention that alternative temperatures may be 
employed; it is anticipated that temperatures higher or lower than 65 deg. C may also result in a 
reversal of cross-links. It is speculated that any temperature above 37 deg. C may, to a certain degree, 
result in the reversal of chemically driven cross-links. In addition, it is contemplated by the present 
invention that reversal of cross-linkages through chemical methods such as alkali treatment as well as 
UV or enzymatic manipulation may be implemented successfully and are covered by the presently 
described invention for the purposes of solid phase sequential immunoprecipitation and ultimately 
transcription factor target gene discovery. 



Precipitation of DNA fragments containing potential transcription factor target loci is 
performed in the presently described invention through the use of a typical salt and ethanol mixture 
(Ausubel et al. (editors), Current Protocols in Molecular Biology, 1994, Chapter 2, p1-3). Those 
known and skilled in the art of standard molecular biology procedures are capable of DNA fragment 
precipitation and collection. It is contemplated that the salt may be omitted without a significant loss 
in sample recovery. In addition, for the purposes of the presently described invention a coprecipitant 
is added which allows for visualization of the DNA pellet after precipitation and centrifugation. The 
coprecipitant Pellet Paint® (Novagen Corporation) has been successfully employed in the present 
invention for purposes of precipitated DNA visualization and increased recovery (Lunyak et al., 
Innovations, 2001, 12: 4-5). It is also contemplated by the presently described invention that other 
coprecipitants may be effectively used to prevent sample loss and increase yield. Polyethylene 
glycol (PEG), yeast RNA or any other coprecipitant which effectively acts as a carrier or allows 
for visualization of the DNA may also be used to accomplish increased yield and minimization of 
sample loss and are covered by the present invention. 

5.5 Modified Inverse PCR in Combination with Sequential Solid Phase Chromosomal 
Immunoprecipitation Allows for the Discovery of Direct Transcription Factor Targets 

Since its inception by Kary Mullis almost 20 years ago, the polymerase chain reaction (PCR) 
has had a revolutionary impact on the study of DNA in general as well as gene expression in 
particular (Innis et al., Academic Press, 1990; McPherson et al., IRL Press, 1991; Erlich, A. 
Stockton Press, 1989; PCR is also illustrated in U.S. patent numbers 4,683,195, 4,683,202, 
4,800,159, 4,965,188, 5,023,171, 5,066,584, 5,075,216, 5,091,310, 5,104,792 which are herein 
incorporated by reference). One significant limitation of PCR is the requirement that flanking 
nucleotide sequence be known in order to generate products of unknown sequence internal to those of 
known origin. The inability to retrieve sequences external to known sequences was overcome by the 
development of inverse PCR (I-PCR), a method in which circularized DNA provides a template for 
the successful amplification of sequences of unknown nucleotide composition external to those 
known (Ochman et al., Genetics, 1988, 120(3): 621-623). Thus it is now possible to retrieve 
nucleotide sequences adjacent to those of known composition for the purposes of assessing the 
identity of template DNA. 
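
The principle of inverse PCR can be illustrated with a minimal sketch in Python. It is an idealized string model only, not part of the described procedure: the function name inverse_pcr_product and the example sequence are hypothetical, ligation is assumed to join the two fragment ends directly, and primer sequences and linkers are ignored.

```python
def inverse_pcr_product(fragment: str, known_start: int, known_end: int) -> str:
    """Idealized model of I-PCR on a self-ligated (circularized) fragment.

    fragment[known_start:known_end] is the region of known sequence; the
    remainder is unknown flanking sequence. Outward-facing primers placed in
    the known region read around the circle, so the product consists of the
    downstream flank joined to the upstream flank at the ligation junction.
    """
    if not 0 <= known_start < known_end <= len(fragment):
        raise ValueError("known region must lie inside the fragment")
    upstream = fragment[:known_start]    # unknown flank 5' of the known region
    downstream = fragment[known_end:]    # unknown flank 3' of the known region
    return downstream + upstream


# Hypothetical fragment: unknown flank + known core + unknown flank.
fragment = "AAACCC" + "GATTACA" + "GGGTTT"
print(inverse_pcr_product(fragment, 6, 13))
```

The sketch shows why circularization matters: both unknown flanks end up in a single product even though only the core sequence was known in advance.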

A modified version of the I-PCR technology is described in the present invention which takes 
advantage of the fact that for many transcription factors the binding site, or at least a consensus 
binding site, has been characterized through methods such as binding site selection (Ausubel et al. 
(editors), Current Protocols in Molecular Biology, 1994, Chapter 12, p11/1-11/6). By utilizing 
degenerate oligonucleotides specific for the binding site of a transcription factor in the format of 
inverse PCR it is possible to generate flanking sequence information which may aid in determining 
if the template in question is a target gene for the transcription factor being studied. More 
specifically, it is now possible to determine if the template is a direct target for transcriptional 
regulation by the transcription factor being studied. Figure 9 illustrates the strategy behind the 
modified I-PCR technology described herein. 
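
As a hedged illustration of how a degenerate oligonucleotide specifies a family of binding sites, the following Python sketch expands IUPAC degenerate bases into a regular expression and scans a sequence for matches. The consensus RRRCWWGYYY is the commonly cited p53 half-site and is an assumption here, not a sequence given in this specification; the function names and the test sequence are likewise hypothetical.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def degenerate_to_regex(oligo: str) -> str:
    """Translate a degenerate oligonucleotide into a regex pattern string."""
    return "".join(IUPAC[base] for base in oligo.upper())


def find_sites(sequence: str, oligo: str) -> list:
    """Return (position, matched sequence) for every match, overlaps included."""
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile("(?=(" + degenerate_to_regex(oligo) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


P53_HALF_SITE = "RRRCWWGYYY"            # assumed consensus, see lead-in
test_sequence = "TTGAACATGTCCCAACATGTTGAG"  # hypothetical fragment
print(find_sites(test_sequence, P53_HALF_SITE))
```

Only the forward strand is scanned in this sketch; a fuller treatment would also scan the reverse complement.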

While a modified version of I-PCR is described in the present invention, other PCR 
technologies may be implemented for the rapid retrieval of transcription factor target genes from 
immunoprecipitated cross-linked protein/DNA complexes as well as DNA purified from these 
complexes upon reversal of cross-linkages. Other PCR amplification technologies contemplated to 
be combined with solid phase sequential immunoprecipitation and therefore covered by the present 
invention include, but are in no way limited to RT-PCR, 5' RACE (Rapid Amplification of cDNA 
Ends), 3' RACE, nested PCR, degenerate oligonucleotide PCR, PCR using oligos coding for 
transcription factor binding sites in combination with oligos coding for sequences proximal to the 
transcriptional initiation site such as the TATAA box, and any PCR technology which aids the 
presently described invention for the purposes of identifying both known and unknown transcription 
factor target loci. 

5.6 Combination of Solid Phase Technology, Sequential Chromosomal Immunoprecipitation, 
Modified I-PCR and other Molecular Cloning Methods for the Purposes of Transcription Factor 
Target Gene Discovery 

The sensitivity of the methodology relies heavily on the availability of high-quality 
monoclonal or polyclonal antibodies that can immunoprecipitate the antigen of interest in an 
efficient and specific manner. The current technology described herein details a method which 
utilizes antibody coated magnetic beads in combination with the use of a coprecipitant for the 
precipitation of chosen antigen/DNA complexes with high efficiency and with virtually no 
background (Figure 3). Strategies implemented for the further reduction in background nonspecific 
binding are discussed below. 

Originally, the ChIP technology was designed to characterize protein interactions with 
known DNA target sequences. The methods described herein have been recently developed so as to 
extend the ChIP assay to high-throughput cloning and analysis of both known and unknown 
sequences, many of which may represent potential transcription factor target loci. Figure 5 outlines 
methods developed and optimized for the efficient acquisition of target gene sequence information. 
DNA cross-linked to binding proteins is isolated from tissue culture cells and sonicated to give the 
desired fragment length in preparation for immunoprecipitation. Fragments may either be 
immunoprecipitated (IP'd) directly or "preIP/IP'd" utilizing an antibody to the large subunit of RNA 
polymerase II that allows for the isolation of only transcribed genes. 

DNA fragments which have been sequentially immunoprecipitated are subsequently run 
through one or more of a series of sequence acquisition options. Cloning of immunoprecipitated 
fragments into bacteriophage particles and/or exon scanning vectors allows for high-throughput 
retrieval of both coding and noncoding individual sequences. Figure 6 illustrates the process of exon 
scanning. In addition, the employment of modified inverse PCR methodology described above 
reveals the possible presence of direct targets for regulation by the transcription factor being studied. 
By assessing the presence of transcription factor binding sites present within the PCR'd and/or 
cloned fragments it is possible to draw conclusions as to the possibility of sequences obtained as 
representing real in vivo targets. This observation is more applicable to factors which have large, 
invariant and therefore rare binding sites. In addition, it is contemplated and therefore covered by 
the present invention that computer programs may be used to analyze sequences obtained in search 
of exons or other revealing attributes such as regulatory elements. 
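
The remark that large, invariant binding sites are rare can be made concrete with a rough calculation. The Python sketch below estimates how often a site would occur by chance in a fragment of a given length. It is only an approximation under stated assumptions: bases are taken to be independent and equally frequent, both strands are counted separately (which double-counts palindromic sites), and the function names are hypothetical.

```python
# Number of nucleotides allowed at a position for each IUPAC code.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "W": 2, "S": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def match_probability(site: str) -> float:
    """Probability that a random window matches the (possibly degenerate) site."""
    probability = 1.0
    for base in site.upper():
        probability *= DEGENERACY[base] / 4.0
    return probability


def expected_chance_hits(site: str, fragment_length: int, both_strands: bool = True) -> float:
    """Expected number of chance matches in a random fragment of the given length."""
    windows = max(fragment_length - len(site) + 1, 0)
    strands = 2 if both_strands else 1
    return windows * strands * match_probability(site)


# A 10-base degenerate site in a 300bp fragment versus a short 4-base site.
print(expected_chance_hits("RRRCWWGYYY", 300))
print(expected_chance_hits("ACGT", 300))
```

Under these assumptions a short or highly degenerate site is expected to appear by chance in most fragments, whereas a long invariant site is not, which is why the presence of the latter is more informative.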

Finally, internal controls such as known genes, binding site information and gel 
electromobility shift assays (EMSAs) can be used for detailed assessment of signal-to-noise issues 
(Ausubel et al. (editors), Current Protocols in Molecular Biology, 1996, Chapter 12, p/1-2/5). 
Through implementation of the technology represented by the presently described invention it is 
possible to quickly and efficiently implement a saturation screen for target loci for virtually any 
transcription factor. A demonstration of specific aspects of this technology is evidenced in Figure 7 
for the basal transcription factor RNA Polymerase II. A series of positive internal controls, such as 
immunoprecipitation with antibodies specific for histones, as well as background minimization 
parameters are incorporated into the presently described technology, which results in an unsurpassed 
level of "genome sifting" for genes regulated by transcription factors of either DNA binding or 
nonDNA binding origin. 

Given the fact that background nonspecific chromatin immunoprecipitation could affect the 
ability to define a considerable percentage of the in vivo targets for the transcription factors being 
studied, an assessment of background levels is necessary. Experimental procedures have been 
designed which will evaluate the "signal to noise" ratio for each factor. This is possible by assessing 
within a defined population of immunoprecipitated chromatin the representation of known genes 
regulated by the chosen factor. Therefore, for each factor an extensive Southern blot 
characterization of sample populations of fragments with known target loci as probes may be 
performed. For example, p53 has previously been shown to be a transcriptional regulator of the p21 
locus (El-Deiry et al., Cell, 1993, 75: 817-825). The binding site has been thoroughly mapped and 
sequenced. Thus, it will be possible to utilize this locus as a probe to determine, in a given population of 
cloned fragments, the percentage which hybridize to this particular probe. Calculation of background 
can then be performed by assessing the percentage of the population which represents said known 
target and extrapolating from the predicted number of targets for p53. For example, by assuming a 
reasonable number of direct targets for p53 at between 30-50 (i.e. for genes involved primarily in 
regulating proliferation, not apoptosis) it is possible to calculate the efficiency of the system. 
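
One way such an extrapolation might be carried out is sketched below in Python. This is an interpretation rather than a stated procedure: it assumes that every true target locus is equally represented among the cloned fragments, so that the fraction of clones hybridizing to one known target, multiplied by the assumed number of targets, approximates the fraction of clones that are real targets. The clone counts and function names are hypothetical.

```python
def estimate_signal_fraction(clones_screened: int, clones_hybridizing: int,
                             assumed_target_count: int) -> float:
    """Estimate the fraction of clones representing true targets.

    Assumes each true target is equally represented, so one known target
    accounts for 1/assumed_target_count of the true-target clones.
    """
    if clones_screened <= 0 or assumed_target_count <= 0:
        raise ValueError("counts must be positive")
    known_fraction = clones_hybridizing / clones_screened
    return min(known_fraction * assumed_target_count, 1.0)


def estimate_background_fraction(clones_screened: int, clones_hybridizing: int,
                                 assumed_target_count: int) -> float:
    """Fraction of clones attributed to nonspecific background."""
    return 1.0 - estimate_signal_fraction(
        clones_screened, clones_hybridizing, assumed_target_count)


# Hypothetical screen: 1000 clones, 10 of which hybridize to the p21 probe,
# evaluated at both ends of the assumed 30-50 target range.
for targets in (30, 50):
    print(targets, estimate_background_fraction(1000, 10, targets))
```

Because the result depends directly on the assumed number of targets, it is best read as a range rather than a single value.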

5.7 The Creation of a Compendium of Transcription Factor Targets 

Application of the improved solid phase sequential chromosomal immunoprecipitation 
technologies combined with standard and modified molecular biology procedures described herein 
allows for the collection of DNA fragments putatively containing target loci for any of a multitude of 
both DNA binding and nonDNA binding transcription factors. Said collection of DNA represents 
valuable nucleotide sequence information which may reveal novel gene cascades and ultimately 
therapeutic targets or strategies for the efficient design of therapeutics to treat a variety of human 
anomalies. The genetic cascades and protein entities encoded by loci representing transcription 
factor target genes will undoubtedly reveal novel mechanisms of therapeutic intervention. 

It is speculated that the collection of nucleotide sequence information obtained through 
implementation of the presently described invention or technologies described herein may be 
organized into a searchable database format. This is particularly applicable with respect to each 
transcription factor or with respect to the discrete realms of human physiology and disease which are 
represented by the transcription factors for which target genes are discovered. Database 
configuration of nucleotide sequence information for the purposes of therapeutic target discovery is 
not a new concept and has proven considerably beneficial to the scientific and medical communities 
(Celera Discovery System™, Celera Genomics, Inc.; Lifeseq™, Incyte Pharmaceuticals, Inc.; 
Deltabase™, Deltagen, Inc.) (Venter et al., Science . 2001, 291(5507): 1304-1351). Thus the 
nucleotide sequences of either coding or noncoding origin (i.e. regulatory elements) discovered 
through implementation of the technology described herein represent a valuable entity which may be 
mined for therapeutic utility via efficient computer algorithms. Programming languages 
contemplated by the present invention which may be utilized to create searchable databases of 
transcription factor target genes include but are in no way limited to C, C+, C++, Visual C++, Basic, 
Visual Basic, Java, Visual Java, Perl and any other program which effectively annotates sequence 
information discovered by implementation of the technology described herein. In addition, it is 
contemplated and therefore covered by the present invention that computer programs may be utilized 
to search or scan sequences obtained by technology described in the present invention for the 
purposes of discovering valuable coding sequence or regulatory information. Use of programs such 
as, but not limited to, BLAST, BLASTX, BLASTP and TBLASTN for such purposes is therefore 
contemplated and covered by the present invention (Altschul et al., J. Mol. Biol., 215: 555-565; 
National Center for Biotechnology Information, Basic Local Alignment Search Tool computer 
algorithms and related variations; Ausubel et al. (editors), Current Protocols in Molecular Biology, 
1995, Chapter 19, p3/1-3/27). 
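
A minimal sketch of a searchable database of this kind is given below, using Python and its built-in sqlite3 module. The table layout, column names, clone names and sequences are all hypothetical and chosen for illustration; no particular schema is prescribed by the present description.

```python
import sqlite3


def build_database(records):
    """Create an in-memory database of (factor, clone_name, sequence) records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE target_fragments ("
        "id INTEGER PRIMARY KEY, "
        "factor TEXT NOT NULL, "
        "clone_name TEXT NOT NULL, "
        "sequence TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO target_fragments (factor, clone_name, sequence) VALUES (?, ?, ?)",
        records,
    )
    conn.commit()
    return conn


def search_by_factor(conn, factor):
    """Return all fragments recovered for one transcription factor."""
    cursor = conn.execute(
        "SELECT clone_name, sequence FROM target_fragments "
        "WHERE factor = ? ORDER BY clone_name",
        (factor,),
    )
    return cursor.fetchall()


def search_by_motif(conn, motif):
    """Return fragments whose sequence contains an exact motif."""
    cursor = conn.execute(
        "SELECT factor, clone_name FROM target_fragments "
        "WHERE instr(sequence, ?) > 0 ORDER BY clone_name",
        (motif,),
    )
    return cursor.fetchall()


# Hypothetical records for illustration only.
conn = build_database([
    ("p53", "clone_001", "TTGAACATGTCC"),
    ("p53", "clone_002", "GGGCCC"),
    ("Pit-1", "clone_003", "ATGAATAT"),
])
print(search_by_factor(conn, "p53"))
print(search_by_motif(conn, "CATG"))
```

Organizing records by transcription factor, as here, corresponds to the first of the two groupings mentioned above; a column for physiological or disease area could be added in the same way.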

Below are described several examples which illustrate application of the presently described 
invention for the purposes of discovering transcription factor target loci. It should in no way be 
inferred that the below examples represent the only application of the described technology and 
hence the invention is not to be constrained by these particular examples. 

6.1 Demonstration of Improved Chromosomal Immunoprecipitation 

For some transcription factor target genes DNA binding sites of either a direct or indirect 
nature may be located very proximal to the basal transcriptional machinery and transcriptional 
initiation site of target loci. Other sites may be a distance of several kilobases from the promoter 
region and transcriptional initiation site. Therefore the ability to generate DNA fragments of 
different sizes represents a crucial aspect of the described technology. By varying fragment length it 
is possible to immunoprecipitate DNA molecules containing sites not only proximal but also distal to 
the transcriptional initiation region. Figure 4 illustrates the ability to "customize" DNA fragment 
length by varying sonication conditions. DNA isolated from cells was sonicated for increasing 
lengths of time, run on a 1.2% agarose gel in 0.5X TBE along with molecular weight markers 
and stained with ethidium bromide. As the length of time for sonication is increased, it is evident 
that the fragment sizes of cross-linked DNA become smaller. It is this customizable aspect of the 
described technology which makes it possible to isolate and characterize virtually any transcription 
factor target gene. 

A basic demonstration of the application of the technology described herein focuses on the 
gene II/9-1 of Sciara coprophila (Gabrusewycz-Garcia, N. and Kleinfeld, R.G., Journal of Cell 
Biology, 1966, 29(2): 347-359). Figure 7 reveals the ability to immunoprecipitate target genes 
utilizing antibodies specific for the large subunit of RNA polymerase II. In vivo ChIP assay reveals 
an engaged RNA Pol II at the Sciara coprophila gene II/9-1 promoter during the amplification stage of 
larval development. 25 salivary glands from either preamplification or amplification stages of larval 
development were dissected from larvae and incubated in Canon's medium containing 1.0% 
formaldehyde for 15 min at room temperature and for 30 min at 4 degrees C. In vivo fixed isolated 
chromosomal DNAs were recovered from tissues and digested with Hind III enzyme (the Hind III 
restriction map of the DNA Puff II/9A locus is given in panel (A)). Released chromatin fragments 
were immunoprecipitated with antibodies raised against the Drosophila melanogaster second large 
subunit of RNA Pol II (gAPD-1) (Weeks et al., Genes Dev. 7: 2329-2344) or with monoclonal 
anti-histone antibodies (Chemicon, Inc.) coated to Dynabeads (Dynal Corporation). The specificity of 
these antibodies against Sciara coprophila protein extracts was analyzed by Western blot and shown 
on panel (A). Pellets were washed extensively and freed from cross-links by incubation of pelleted 
cross-linked protein/DNA complexes at 65 degrees C overnight. Purified DNA fragments were 
subjected to 30 cycles of PCR using primer sets C and B and analyzed on a 2.5% agarose gel. 
Primer set B was chosen as a control to demonstrate the specificity of immunoprecipitation with 
antibodies which recognize the large subunit of RNA polymerase II. Given the fact that binding 
sites for set B are located 2.5kb upstream of the gene II/9-1 promoter and sonication resulted in 
300-500bp fragments, no ORI sequences should be detected. (B) PCR analysis of input DNAs before 
immunoprecipitation. Equivalent volumes of purified, in vivo formaldehyde-fixed, Hind III digested 
DNA samples from either preamplification stage (1) or amplification stage (2) without 
immunoprecipitation were freed of cross-links and analyzed by 30 cycles of PCR with primer set C. 
(C) ChIP of cross-linked DNA reveals an engaged RNA Pol II at the promoter of gene II/9-1 only 
during the DNA amplification stage (8) but not at the preamplification stage of larval development (7). 
At the same time formaldehyde cross-linked histones are detectable on a promoter-containing DNA 
fragment during both preamplification and amplification stages of larval development (C). Equal 
amounts of cross-linked, Hind III-digested DNA material were precipitated either with anti-histone 
antibodies (lanes 1,2,3,4) or anti-Pol II antibodies (lanes 7, 8), or subjected to non-immune 
precipitation by magnetic beads as a control to monitor nonspecific precipitation of cross-linked 
complexes (5,6). Samples in lanes 1,2,4,5,6,7,8 were freed of cross-links and 30 cycles of PCR with 
primer set C were done for each sample. The absence of PCR product in lane 3 demonstrates the 
necessity of thermal reversal of the cross-links prior to PCR. Lanes 5 and 6 show that no PCR 
products are detected in non-immune precipitates. (D) Similar ChIP experiments with PCR analysis 
of a distinct genomic region after IPs were done in order to demonstrate completion of Hind III 
restriction digestion. No products were obtained by PCR amplification with primer set B in the 
samples of anti-RNA Pol II IP for both stages. The level of anti-histone pull down is the same (3, 4) 
as shown by primer set C. 

Multiple rounds of immunoprecipitation may reduce the amount of DNA template acquired 
due to loss upon repeated immunoprecipitation and washing. 
Thus, it is necessary to test the recoverability of target genes both before and after sequential 
immunoprecipitation. Figure 8 demonstrates both the efficiency and stringency of multiple 
immunoprecipitation rounds by assessing the quantitative presence of the p21 target gene of the 
transcription factor p53 both before and after sequential IP at various stages of the process (El-Deiry et 
al., Cell, 1993, 75: 817-825). HeLa cells were grown to 60% confluency on a 100mm Petri dish, 
irradiated at 0.5 Gy to stimulate a p53 dependent response and incubated for 6 hours at 37 deg. C 
and 7.2% CO2. Cells were cross-linked in 10% Fetal Bovine Serum Medium containing 1.0% 
formaldehyde for 30 minutes at 4 deg. C. Cells were harvested, lysed and chromatin fragment 
length was customized to a length of 50-300bp through implementation of microtip sonication via 
9 X 15 second pulses of a Branson model 250 sonifier with a 5.0 minute incubation on ice between 
each 15 second pulse. Samples of PCR template were taken at various points during the solid phase 
sequential immunoprecipitation procedure to assess the presence or absence of the p21 target gene. 
p21 sequences were detected only in the sonicated sample prior to immunoprecipitation (sample #1) 
and in the fraction containing cross-linked protein/DNA adducts precipitated by both antibodies 
recognizing the large subunit of RNA polymerase II and holo p53 (sample #5). Semi-quantitative 
PCR demonstrates that very little, if any, template is lost after double IP and the implementation of 
extensive washing conditions. In addition, no detection of the p21 gene was observed in the 
supernatant after Pol II large subunit immunoprecipitation (sample #2), the residual wash of Pol II 
large subunit antibody conjugated beads after immunoprecipitation (sample #3) or the pelleted beads 
after pH adjustment and reversal of antibody/ligand binding (sample #4). 

6.2 Demonstration of Novel p53 Target Gene Capture 

In order to demonstrate the utility of the presently described invention, transcription factor 
target sequences were sought for the mammalian tumor suppressor p53 as mentioned above. 
Specifically, HeLa cells at 60% confluency on a 100mm Petri dish were subjected to 
0.5 Gy and incubated for 6 hours at 37 deg. C, 7.2% CO2. Irradiation of cells activates the p53 
response to DNA damage and allows for a characterization of target gene activity. Cells were 
subsequently cross-linked in 1.0% formaldehyde for 20 minutes, neutralized in 100mM glycine and 
harvested for lysis. After three rinses in PBS, partial lysis was carried out in 200ul 100mM Hepes, 
pH 7.6, 2mM EGTA, 2mM EDTA, 2.0% Triton X-100 via 25 strokes through a 25G needle. After 
brief centrifugation and removal of supernatant, lysis was continued by performing 25 strokes 
through a 25G needle in the presence of 200ul 100mM Hepes, pH 7.6, 20mM MgCl2, 3.0% sarcosyl, 
150mM NaCl. After brief centrifugation and removal of supernatant, lysis was completed by 
performing 25 strokes through a 25G needle in the presence of 200ul 100mM Hepes, pH 7.6, 20mM 
MgCl2, 150mM NaCl. Upon removal of supernatant, samples were dissolved in 100ul 10mM Tris, 
pH 7.6, 5mM EDTA. 
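
For readers preparing the buffers above, the following Python sketch works out the volume of each stock solution needed for a given final volume using the standard dilution relation (stock concentration x stock volume = final concentration x final volume). The stock concentrations shown are assumptions chosen for illustration and are not specified in the present description; the function name is hypothetical.

```python
def stock_volumes_ul(final_volume_ul, components):
    """Compute stock volumes for a buffer.

    components maps a name to (stock_concentration, final_concentration),
    both expressed in the same unit for that component (e.g. mM or %).
    The remainder of the final volume is made up with water.
    """
    volumes = {}
    for name, (stock, final) in components.items():
        if final > stock:
            raise ValueError(f"{name}: final concentration exceeds stock")
        volumes[name] = final_volume_ul * final / stock
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = water
    return volumes


# First lysis buffer from the text, 200ul final volume.
# Stock concentrations below are assumed, not taken from the source.
first_lysis_buffer = {
    "Hepes pH 7.6": (1000.0, 100.0),   # assumed 1M stock, 100mM final
    "EGTA": (500.0, 2.0),              # assumed 0.5M stock, 2mM final
    "EDTA": (500.0, 2.0),              # assumed 0.5M stock, 2mM final
    "Triton X-100": (10.0, 2.0),       # assumed 10% stock, 2.0% final
}
for name, volume in stock_volumes_ul(200.0, first_lysis_buffer).items():
    print(f"{name}: {volume:.1f} ul")
```

The same function can be applied to the second and third lysis buffers by substituting their components.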

Chromatin fragment length was customized to a length of 50-300bp through implementation 
of microtip sonication via 9 X 15 second pulses of a Branson model 250 sonifier with a 5.0 minute 
incubation on ice between each 15 second pulse. Figure 4 illustrates the fragment sizes obtained on 
a 1.2% agarose gel stained with ethidium bromide and analyzed under UV fluorescence. 

Solid phase sequential chromosomal immunoprecipitation was performed with 
superparamagnetic tosylactivated Dynabeads (Dynal Corporation) linked to antibodies specific for 
the large subunit of RNA polymerase II and holo p53. The p53 antibody utilized in the current 
experiment was p53 (FL-393) (cat. #6243, Santa Cruz Biotechnology, Inc.). Fragments containing 
transcribed sequences were first isolated from sonicated chromatin samples through the use of beads 
coated with an antibody to the large subunit of RNA polymerase II (cat. #sc-9001, Santa Cruz 
Biotechnology, Inc.). 2ul of antibody coated beads were incubated with 5ul sonicated sample DNA 
in 10ul PBS buffer overnight at 4.0 deg. C. At this step the pH was altered to a value of 5.5 to allow 
for the release of cross-linked protein/DNA adducts from the antibody-coated beads. Subsequently, 
immunoprecipitation using beads coated with antibodies to p53 was performed as described above. 
After the second round of chromosomal immunoprecipitation, bead/antibody/protein/DNA complexes 
were washed 3 times in 500ul 100mM Hepes, pH 7.6, 2mM EGTA, 2mM EDTA, 2% Triton X-100, 3.0% 
Empigen (cat. #324690, Calbiochem Corporation). A similar wash was repeated 3 times in the same 
buffer containing 1.0% Empigen. A final wash was subsequently performed in a similar buffer 
without Empigen. Proteinase K treatment was performed on samples for 1.0 hour at 50.0 deg. C by 
standard protocol. DNA was precipitated via the addition of 250ul 100% ethanol, 10ul 2.5M NaOAc 
and 2ul Pellet Paint® coprecipitant (cat. #70748-3, Novagen Corp.). 
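
The selection logic of the two sequential rounds described above can be sketched as successive 
filters over a pool of fragments. This is only a conceptual toy model, not the laboratory 
procedure: the Fragment class, the function names and the example fragment pool are all 
hypothetical and were chosen for illustration. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A sonicated chromatin fragment and the proteins cross-linked to it (toy model)."""
    name: str
    bound_proteins: frozenset


def immunoprecipitate(fragments, target):
    """Keep only fragments carrying the protein recognized by the antibody."""
    return [f for f in fragments if target in f.bound_proteins]


def sequential_chip(fragments, targets):
    """Apply one round of immunoprecipitation per target, in the order given.

    The output of each round is the input of the next, so only fragments
    carrying every target protein survive all rounds.
    """
    selected = list(fragments)
    for target in targets:
        selected = immunoprecipitate(selected, target)
    return selected


# Hypothetical fragment pool, for illustration only.
pool = [
    Fragment("frag_polII_only", frozenset({"RNA Pol II"})),
    Fragment("frag_p53_only", frozenset({"p53"})),
    Fragment("frag_both", frozenset({"RNA Pol II", "p53"})),
    Fragment("frag_unbound", frozenset()),
]

enriched = sequential_chip(pool, ["RNA Pol II", "p53"])
print([f.name for f in enriched])
```

In this model only the fragment bound by both RNA polymerase II and p53 is retained, which 
mirrors the intent of isolating p53-bound fragments that also contain transcribed sequences. 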

Figure 9 illustrates the concept of modified inverse PCR (I-PCR) for the purposes of defining 
transcription factor target loci in the context of sequential chromosomal immunoprecipitation. PCR 
is possible through the addition of linkers bearing the restriction site and subsequent episomal 
circularization. The success of the application of I-PCR itself suggests that the DNA fragments 
isolated may inherently be direct target genes of the transcription factor being studied, in this case 
p53. In the present example, degenerate oligonucleotide sequences corresponding to the p53 
consensus DNA binding site (RRRCWWGYYYRRRCWWGYYY) linked to an EcoRI restriction 
site were utilized to perform PCR and obtain flanking nucleotide sequence information (El-Deiry et 
al., Nat. Genet., 1992, 1(1): 45-49). 1.0ug of sequentially immunoprecipitated DNA was blunt 
ended in the presence of Klenow fragment (cat. #M0212S, New England Biolabs) and 25uM dNTPs 
for 1.0 hour at 37.0 deg. C. After NucTrap™ (cat. #400702, Stratagene Corporation) spin column 
purification, EcoRI linkers were ligated to the fragments at 4.0 deg. C overnight. Fragments flanked 
by restriction sites were kinased in the presence of T4 polynucleotide kinase (cat. #M0201, New 
England Biolabs) and 10mM ATP and religated for circularization. 50ng of circularized template 
DNA in combination with 100ng of each oligonucleotide was subjected to 25 rounds of PCR under the 
following conditions: 98.0 deg. C for 30 seconds, 55.0 deg. C for 30 seconds and 72.0 deg. C for 
seconds. PCR fragments were excised from a 1.2% agarose gel, blunted and shotgun subcloned into 
the SmaI restriction site of pBluescript SK. Plasmids containing fragments were sequenced via 
Sanger dideoxy sequencing methodology. The presence of the EcoRI linker sequence reveals the 
outermost flanks amplified by the PCR reaction. 
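
To make the degenerate consensus concrete, the following minimal sketch expands the IUPAC codes 
in RRRCWWGYYYRRRCWWGYYY (R = A or G, W = A or T, Y = C or T) into a regular expression, counts 
how many distinct oligonucleotides the degenerate sequence represents, and locates exact matches 
in a test sequence. The function names and the test sequence are assumptions made for 
illustration and are not part of the described procedure. 

```python
import re

# IUPAC nucleotide codes appearing in the p53 consensus, mapped to allowed bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "W": "AT"}

P53_CONSENSUS = "RRRCWWGYYYRRRCWWGYYY"


def consensus_to_regex(consensus):
    """Translate a degenerate sequence into a regex string, one class per position."""
    return "".join("[" + IUPAC[code] + "]" for code in consensus)


def count_expansions(consensus):
    """Number of distinct non-degenerate sequences covered by the consensus."""
    total = 1
    for code in consensus:
        total *= len(IUPAC[code])
    return total


def find_sites(sequence, consensus=P53_CONSENSUS):
    """Return (offset, matched_site) for every exact match, including overlapping ones."""
    # A lookahead with a capture group lets finditer report overlapping matches.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical test sequence: a perfect site flanked by two bases on each side.
example = "TT" + "AGACATGCCCGGGCATGTCC" + "AA"
print(count_expansions(P53_CONSENSUS))
print(find_sites(example))
```

Each half-site contains eight two-fold degenerate positions, so the full 20-mer covers 
2^16 = 65,536 distinct sequences. 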

Table 1 reveals two examples of nucleotide sequences obtained by procedures described 
herein. Each sequence exhibits high sequence identity to the consensus binding site for p53 (bold 
letters denote nucleotides fitting the p53 binding site consensus). Sequence A reveals similarity to 
nucleotide sequences present on Homo sapiens chromosome 17, GenBank accession #AC005562. 
Sequence B reveals homology to sequences present in Homo sapiens BAC clone RP11-557N21, 
GenBank accession #AC009242. Both genomic sequences obtained by I-PCR were subcloned 
upstream of a basal promoter linked to the luciferase reporter gene and cotransfected (20ug each) 
with a eukaryotic expression vector containing a cDNA coding for human holo p53 into HeLa cells at 
60% confluency. Cells were subsequently harvested 24 hours after transfection for analysis of 
reporter gene induction. Induction of transcription of the luciferase reporter was observed for both 
sequences as compared to basal levels (see Table 1), thus confirming the identification of novel 
enhancer elements regulable by the transcription factor p53. The proximity of these regulatory 
elements with respect to transcribed sequences remains to be determined. 
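
One way to quantify how well a recovered fragment fits the consensus is to slide a 20-nucleotide 
window along it and count the positions that satisfy RRRCWWGYYYRRRCWWGYYY. The sketch below is 
an illustrative assumption, not the analysis used to prepare Table 1: the scoring scheme and 
function names are hypothetical, no spacer between the two half-sites is modeled, and Sequences A 
and B are entered as transcribed from the sequence listing with spacing removed. Because the 
consensus is its own reverse complement, scanning a single strand is sufficient. 

```python
# Allowed bases for each IUPAC code appearing in the p53 consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "W": "AT"}

P53_CONSENSUS = "RRRCWWGYYYRRRCWWGYYY"

# Sequences as transcribed from the sequence listing (spacing removed).
SEQUENCE_A = "CTTCCCTCATCCTCCAGCAGGCTAGCTCGGGCTTGGTCACCTGACAGGGG"
SEQUENCE_B = "CAAGAAGGGGACTCGGCTTGTCTGAACTAGCATAAGGAAGTCCC"


def window_score(window, consensus=P53_CONSENSUS):
    """Count positions in the window that fit the consensus code at that position."""
    return sum(base in IUPAC[code] for base, code in zip(window, consensus))


def best_match(sequence, consensus=P53_CONSENSUS):
    """Return (score, offset, window) for the best ungapped window.

    Ties are resolved in favor of the leftmost window.
    """
    sequence = sequence.upper()
    width = len(consensus)
    best = (-1, -1, "")
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = window_score(window, consensus)
        if score > best[0]:
            best = (score, offset, window)
    return best


for label, seq in (("A", SEQUENCE_A), ("B", SEQUENCE_B)):
    score, offset, window = best_match(seq)
    print(f"Sequence {label}: {score}/{len(P53_CONSENSUS)} positions fit at offset {offset}: {window}")
```

A score of 20 indicates a window that fits the consensus at every position; lower scores give 
the number of fitting positions, analogous to the bold letters referred to above. 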

What is claimed is: 

1. A method which utilizes chromosomal immunoprecipitation procedures for the discovery and 
characterization of transcription factor target genes. 

2. A method according to claim 1 comprising a process of: 

a) attaching a protein binding entity to a support matrix; 

b) utilizing the support matrix/protein binding entity described in a) for the purposes of 
purifying protein/DNA complexes such as chromatin from cell extracts. 

3. A method according to claim 2 wherein said protein binding entity is an antibody. 

4. A method according to claim 1 in which said transcription factors are of a DNA binding 
nature. 

5. A method according to claim 1 in which said transcription factors are recruited to DNA 
through contact with other proteins. 

6. A method according to claim 1 in which said transcription factor target genes consist of 
coding sequences. 

7. A method according to claim 1 in which said transcription factor target genes consist of 
noncoding sequences, including regulatory elements. 

8. A method according to claim 7 in which said noncoding sequences are regulated by 
transcription factors. 

9. A method according to claim 2 comprising a process of multiple sequential rounds of 
immunoprecipitation utilizing protein binding entities of differing origin which involves the process 
of: 

a) immunoprecipitation of cross-linked protein/DNA complexes utilizing protein binding 
entities specific for one protein followed by; 

b) a second round of immunoprecipitation of protein/DNA complexes utilizing complexes 
isolated by the first round as the substrate and protein binding entities specific for a different protein. 

10. A method according to claim 9 which comprises more than two rounds of chromosomal 
immunoprecipitation. 

11. A method according to claim 9 wherein said protein binding entity is specific for members of 
the basal transcriptional machinery. 

12. A method which includes utilizing purified DNA fragments isolated by chromosomal 
immunoprecipitation procedures described in claim 1 to cross hybridize against libraries of 
nucleotide sequences. 

13. A method which includes utilizing purified DNA fragments isolated by chromosomal 
immunoprecipitation procedures described in claim 1 to screen against arrays of nucleotide 
sequences. 

14. A method according to claim 1 which includes the implementation of inverse PCR for the 
discovery of sequences directly bound by said transcription factors. 

15. A method according to claim 1 which includes cloning of purified DNA fragments isolated 
by chromosomal immunoprecipitation into vectors for purposes of manipulation and sequence 
determination. 

16. A protein/DNA complex isolated from cells according to methods described in claim 1. 

17. DNA fragments isolated from protein/DNA complexes described in claim 16. 

18. Nucleotide sequences present in DNA fragments described in claim 17 wherein said 
sequences represent noncoding sequences which may include 5 prime and 3 prime untranslated 
regions, introns, promoter, enhancer and/or silencer elements. 

19. Nucleotide sequences present in DNA fragments described in claim 17 wherein said 
sequences represent coding sequences which correspond to specific amino acid sequences present in 
putative proteins. 

20. Amino acid sequences encoded by nucleotide sequences described in claim 19. 

21. Proteins represented by amino acid sequences described in claim 20. 

22. A database of sequence information formed from isolated sequences by methods according to 
claim 1 which represents a cohesive organization of transcription factor target genes. 

Sequence Listings Declared by the Present Invention 

Sequence A 

5' CTTCCCTCATCCTCCAGCAGGCTAGCTCGGGCTTGGTCACCTGACAGGGG 3' 

Sequence B 

5' CAAGAAGGGGACTCGGCTTGTCTGAACTAGCATAAGGAAGTCCC 3' 